mariadb/storage/maria/ma_recovery.c

3687 lines
118 KiB
C

WL#3072 - Maria recovery
Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery), then moves the tables elsewhere, recreates the tables from the log, compares them, and fails if there is a difference. Passes now.
Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria.
Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW.
Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE.
Code cleanups.
Monty: please look for "QQ". Sanja: please look for "Sanja".
Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase...
Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE.

sql/handler.cc:
  typo
storage/maria/Makefile.am:
  Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c
storage/maria/ha_maria.cc:
  comments
storage/maria/ma_bitmap.c:
  fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_data_file() moves into the bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct; see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it).
storage/maria/ma_blockrec.c:
  * no need to sync the data file if table is not transactional
  * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store).
  * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR).
  * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update).
  * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update).
  * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now).
storage/maria/ma_blockrec.h:
  new functions. maria_bitmap_marker does not need to be exported.
storage/maria/ma_close.c:
  we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()).
storage/maria/ma_control_file.c:
  when in Recovery, some assertions should not be used.
storage/maria/ma_control_file.h:
  double-inclusion safe
storage/maria/ma_create.c:
  during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c
storage/maria/ma_delete_table.c:
  during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero.
storage/maria/ma_loghandler.c:
  * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state").
  * LOG_DESC::record_ends_group changed to an enum.
  * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected
  * Sanja please see the @todo LOG BUG
  * avoiding DBUG_RETURN(func()) as it gives confusing debug traces.
storage/maria/ma_loghandler.h:
  - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had
  - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log.
storage/maria/ma_recovery.c:
  New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled.
storage/maria/ma_recovery.h:
  new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore.
storage/maria/ma_rename.c:
  don't write log records during recovery
storage/maria/ma_test2.c:
  - fail if maria_info() or other subtests find some wrong information
  - new option -g to skip updates.
  - init the translog before creating the table, so that log applying can work.
  - in "#if 0" you'll see some fixed bugs (will be removed).
storage/maria/ma_test_all.sh:
  cleanup files. Test log applying.
storage/maria/maria_read_log.c:
  most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code.
storage/maria/ma_test_recovery:
  unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
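
The record-group rule from the ma_loghandler.h note of the 2007-07-26 commit (a record may not end a group, may end a group, or may be a group in itself, in which case it aborts any unfinished group before it) can be illustrated with a small stand-alone sketch. The enum values, struct, and function names below are hypothetical and are not taken from the Maria sources. Counting a trailing unfinished group as skipped is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names: how a log record relates to its group. */
enum group_role
{
  REC_NOT_LAST_IN_GROUP,  /* e.g. a REDO inside a group */
  REC_LAST_IN_GROUP,      /* e.g. the UNDO that closes a group */
  REC_IS_GROUP_ITSELF     /* e.g. COMMIT: aborts any unfinished group */
};

struct scan_result
{
  size_t executed;  /* records belonging to complete groups */
  size_t skipped;   /* records of groups that were never closed */
};

/* Walk the records in log order and decide which ones would be applied. */
struct scan_result scan_groups(const enum group_role *recs, size_t n)
{
  struct scan_result res = { 0, 0 };
  size_t pending = 0;  /* records seen in the currently open group */
  size_t i;

  for (i = 0; i < n; i++)
  {
    switch (recs[i])
    {
    case REC_NOT_LAST_IN_GROUP:
      pending++;
      break;
    case REC_LAST_IN_GROUP:
      res.executed += pending + 1;  /* the group is complete: apply it */
      pending = 0;
      break;
    case REC_IS_GROUP_ITSELF:
      res.skipped += pending;       /* abort the unfinished group */
      pending = 0;
      res.executed += 1;            /* the record alone is a whole group */
      break;
    }
  }
  res.skipped += pending;  /* assumption: an open group at end of log is not applied */
  return res;
}
```

In this sketch a REDO, REDO, UNDO sequence is applied in full, while a lone REDO followed by a COMMIT is dropped and only the COMMIT counts as executed.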
/* Copyright (C) 2006, 2007 MySQL AB
Fix for LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
More DBUG_PRINT (to simplify future debugging)
Aria: Added STATE_IN_REPAIR, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally.
Aria: Some trivial speedup optimization
Aria: Better warning if table was marked crashed by unfinished repair

mysql-test/lib/v1/mysql-test-run.pl:
  Fix so one can run RQG
mysql-test/suite/maria/r/maria-recovery2.result:
  Update for new error message.
mysys/stacktrace.c:
  Fixed compiler warning
storage/maria/ha_maria.cc:
  More DBUG_PRINT
  Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally.
  Don't log query for dropping temporary table.
storage/maria/ha_maria.h:
  Added prototype for drop_table()
storage/maria/ma_blockrec.c:
  More DBUG_PRINT
  Make read_long_data() inline for most cases. (Trivial speedup optimization)
storage/maria/ma_check.c:
  Better warning if table was marked crashed by unfinished repair
storage/maria/ma_open.c:
  More DBUG_PRINT
storage/maria/ma_recovery.c:
  Give warning if found crashed table. Changed warning for tables that can't be opened.
storage/maria/ma_recovery_util.c:
  Write warnings to DBUG file
storage/maria/maria_chk.c:
  Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally.
storage/maria/maria_def.h:
  Added maria_mark_in_repair(x)
storage/maria/maria_read_log.c:
  Added option: --character-sets-dir
storage/maria/trnman.c:
  By default set min_read_from to max value. This allows us to remove TRN:s from rows during recovery to get more space. This fixes bug LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
2010-07-30 09:45:27 +02:00
Copyright (C) 2010 Monty Program Ab
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
GPL license update (same change as was done for all files in 5.1). storage/maria/Makefile.am: GPL license update storage/maria/ft_maria.c: GPL license update storage/maria/ha_maria.cc: GPL license update storage/maria/ha_maria.h: GPL license update storage/maria/lockman.c: GPL license update storage/maria/lockman.h: GPL license update storage/maria/ma_bitmap.c: GPL license update storage/maria/ma_blockrec.c: GPL license update storage/maria/ma_blockrec.h: GPL license update storage/maria/ma_cache.c: GPL license update storage/maria/ma_changed.c: GPL license update storage/maria/ma_check.c: GPL license update storage/maria/ma_checkpoint.c: GPL license update storage/maria/ma_checkpoint.h: GPL license update storage/maria/ma_checksum.c: GPL license update storage/maria/ma_close.c: GPL license update storage/maria/ma_control_file.c: GPL license update storage/maria/ma_control_file.h: GPL license update storage/maria/ma_create.c: GPL license update storage/maria/ma_dbug.c: GPL license update storage/maria/ma_delete.c: GPL license update storage/maria/ma_delete_all.c: GPL license update storage/maria/ma_delete_table.c: GPL license update storage/maria/ma_dynrec.c: GPL license update storage/maria/ma_extra.c: GPL license update storage/maria/ma_ft_boolean_search.c: GPL license update storage/maria/ma_ft_eval.c: GPL license update storage/maria/ma_ft_eval.h: GPL license update storage/maria/ma_ft_nlq_search.c: GPL license update storage/maria/ma_ft_parser.c: GPL license update storage/maria/ma_ft_stem.c: GPL license update storage/maria/ma_ft_test1.c: GPL license update storage/maria/ma_ft_test1.h: GPL license update storage/maria/ma_ft_update.c: GPL license update storage/maria/ma_ftdefs.h: GPL license update storage/maria/ma_fulltext.h: GPL license update storage/maria/ma_info.c: GPL license update storage/maria/ma_init.c: GPL license update storage/maria/ma_key.c: GPL license update storage/maria/ma_keycache.c: GPL license update 
storage/maria/ma_least_recently_dirtied.c: GPL license update storage/maria/ma_least_recently_dirtied.h: GPL license update storage/maria/ma_locking.c: GPL license update storage/maria/ma_open.c: GPL license update storage/maria/ma_packrec.c: GPL license update storage/maria/ma_page.c: GPL license update storage/maria/ma_panic.c: GPL license update storage/maria/ma_preload.c: GPL license update storage/maria/ma_range.c: GPL license update storage/maria/ma_recovery.c: GPL license update storage/maria/ma_recovery.h: GPL license update storage/maria/ma_rename.c: GPL license update storage/maria/ma_rfirst.c: GPL license update storage/maria/ma_rkey.c: GPL license update storage/maria/ma_rlast.c: GPL license update storage/maria/ma_rnext.c: GPL license update storage/maria/ma_rnext_same.c: GPL license update storage/maria/ma_rprev.c: GPL license update storage/maria/ma_rrnd.c: GPL license update storage/maria/ma_rsame.c: GPL license update storage/maria/ma_rsamepos.c: GPL license update storage/maria/ma_rt_index.c: GPL license update storage/maria/ma_rt_index.h: GPL license update storage/maria/ma_rt_key.c: GPL license update storage/maria/ma_rt_key.h: GPL license update storage/maria/ma_rt_mbr.c: GPL license update storage/maria/ma_rt_mbr.h: GPL license update storage/maria/ma_rt_split.c: GPL license update storage/maria/ma_rt_test.c: GPL license update storage/maria/ma_scan.c: GPL license update storage/maria/ma_search.c: GPL license update storage/maria/ma_sort.c: GPL license update storage/maria/ma_sp_defs.h: GPL license update storage/maria/ma_sp_key.c: GPL license update storage/maria/ma_sp_test.c: GPL license update storage/maria/ma_static.c: GPL license update storage/maria/ma_statrec.c: GPL license update storage/maria/ma_test1.c: GPL license update storage/maria/ma_test2.c: GPL license update storage/maria/ma_test3.c: GPL license update storage/maria/ma_unique.c: GPL license update storage/maria/ma_update.c: GPL license update storage/maria/ma_write.c: GPL 
license update storage/maria/maria_chk.c: GPL license update storage/maria/maria_def.h: GPL license update storage/maria/maria_ftdump.c: GPL license update storage/maria/maria_pack.c: GPL license update storage/maria/tablockman.c: GPL license update storage/maria/tablockman.h: GPL license update storage/maria/trnman.c: GPL license update storage/maria/trnman.h: GPL license update
2007-03-02 11:20:23 +01:00
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
/*
WL#3072 Maria recovery
First version written by Guilhem Bichot on 2006-04-27.
*/
/* Here is the implementation of this module */
#include "maria_def.h"
#include "ma_recovery.h"
#include "ma_blockrec.h"
WL#3072 Maria recovery
Misc changes:
- fix for benign Valgrind error, compiler warnings
- fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints
- fix for too paranoid assertion
- adding ability to take checkpoints at the end of the REDO phase and at the end of recovery.
- other minor changes

storage/maria/ha_maria.cc:
  The checkpoint done after Recovery is finished, is moved to maria_recover().
storage/maria/ma_bitmap.c:
  fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized
storage/maria/ma_checkpoint.c:
  * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc().
  * fix for compiler warnings.
storage/maria/ma_delete_all.c:
  info->trn is NULL for non-transactional tables
storage/maria/ma_locking.c:
  correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count()->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE).
storage/maria/ma_loghandler.c:
  fail early if UNRECOVERABLE_ERROR
storage/maria/ma_recovery.c:
  * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no.
  * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details").
  * Refining an error detection for something which could happen if there is a checkpoint record in the log.
  * Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any).
storage/maria/ma_recovery.h:
  new parameter
storage/maria/maria_read_log.c:
  update to new prototype
2007-10-08 19:08:25 +02:00
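
The ma_checkpoint.c fix in the 2007-10-08 commit ("reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc()") follows a general C rule for buffers that are reused across calls. Here is a minimal sketch of that rule, using the standard realloc() and free() as stand-ins for the mysys wrappers; the struct and function names are hypothetical and do not appear in the Maria sources.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reusable buffer that survives from one checkpoint to the next. */
struct scratch
{
  char *buf;
  size_t len;
};

/* Grow the buffer to at least 'need' bytes. Returns 0 on success, 1 on failure. */
int scratch_reserve(struct scratch *s, size_t need)
{
  char *p;
  if (need <= s->len)
    return 0;
  /* realloc(NULL, n) behaves like malloc(n); a stale freed pointer here is undefined behavior. */
  p = realloc(s->buf, need);
  if (p == NULL)
    return 1;
  s->buf = p;
  s->len = need;
  return 0;
}

/* Free the buffer and reset the pointer so that the next reserve starts clean. */
void scratch_release(struct scratch *s)
{
  free(s->buf);
  s->buf = NULL;  /* without this, the next scratch_reserve() would realloc freed memory */
  s->len = 0;
}
```

Resetting both the pointer and the length keeps the release/reserve cycle safe to repeat any number of times.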
#include "ma_checkpoint.h"
First part of redo/undo for key pages
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows
Checksum for MyISAM now ignores NULL and not used part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable

include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
  Added new error message to be used when handler initialization failed
include/my_global.h:
  Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
  Added const to some parameters
mysys/array.c:
  More DBUG
mysys/my_error.c:
  Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h:
  Renamed variables to avoid variable shadowing
sql/handler.h:
  Renamed parameter to avoid variable name conflict
sql/item.h:
  Renamed variables to avoid variable shadowing
sql/log_event_old.h:
  Renamed variables to avoid variable shadowing
sql/set_var.h:
  Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc:
  Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am:
  Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
  Removed extra empty line
storage/maria/ma_checkpoint.c:
  Remove not needed trnman.h
storage/maria/ma_close.c:
  Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
  Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c:
  New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (we need now also for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some not needed arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some not needed initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
  Removed end space
storage/maria/ma_check.c:
  Removed end space
storage/maria/ma_create.c:
  Removed end space
storage/maria/ma_locking.c:
  Removed end space
storage/maria/ma_packrec.c:
  Removed end space
storage/maria/ma_pagecache.c:
  Removed end space
storage/maria/ma_panic.c:
  Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  use MARIA_RECORD_POS of record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c:
  Removed end space
storage/maria/ma_statrec.c:
  Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Remove not used code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
  Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open()
  Zerofill new used pages for:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c:
  Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
  Updated arguments for changed function calls
storage/myisam/mi_check.c:
  Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
  Update checksums to ignore null columns
storage/myisam/mi_create.c:
  Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
  Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
  Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
  New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
  New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
  New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
#include "trnman.h"
#include "ma_key_recover.h"
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, including index sort (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional.
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
#include "ma_recovery_util.h"
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash occurs between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
struct st_trn_for_recovery /* used only in the REDO phase */
{
LSN group_start_lsn, undo_lsn, first_undo_lsn;
TrID long_trid;
};
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
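The bitmap-page change described in the commit message above (truncating to 8190 bytes instead of 8192) relies on recovery being able to tell a short page from a complete one by file length alone. A minimal sketch of such a check, assuming an 8192-byte block; `sketch_page_is_complete` and `SKETCH_BLOCK_SIZE` are hypothetical names for illustration, not functions of the Maria code base:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8k pages, as in the commit message above. */
#define SKETCH_BLOCK_SIZE 8192u

/*
  Hypothetical helper: returns 1 if 0-based page number 'page' lies
  entirely inside a file of 'file_length' bytes, 0 otherwise.
*/
static int sketch_page_is_complete(uint64_t file_length, uint64_t page)
{
  return file_length >= (page + 1) * (uint64_t) SKETCH_BLOCK_SIZE;
}
```

With a file of 8190 bytes, page 0 is reported as incomplete, which is the situation in which the message says recovery recreates the page entirely instead of reading a page that lacks its end marker.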
struct st_table_for_recovery /* used in the REDO and UNDO phase */
{
  MARIA_HA *info;
};
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
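The commit message above replaces the boolean "record ends a group" with three cases: a record may not end a group, may end a group, or may be a group in itself (in which case it aborts any unfinished group, as COMMIT does). A rough sketch of that bookkeeping follows; the enum, struct and function names are hypothetical and only count records, they are not the identifiers or the logic used in ma_loghandler.h or ma_recovery.c:

```c
#include <assert.h>

/* Hypothetical names; the three cases come from the commit message. */
enum sketch_group_role
{
  SKETCH_NOT_GROUP_END,   /* e.g. a REDO: stays pending until the group ends */
  SKETCH_GROUP_END,       /* e.g. an UNDO: closes the group, REDOs are run */
  SKETCH_GROUP_IN_ITSELF  /* e.g. COMMIT: aborts any unfinished group */
};

struct sketch_group_state
{
  unsigned pending;   /* records buffered in the current unfinished group */
  unsigned executed;  /* records that were processed */
  unsigned aborted;   /* records dropped because their group never ended */
};

static void sketch_see_record(struct sketch_group_state *st,
                              enum sketch_group_role role)
{
  switch (role)
  {
  case SKETCH_NOT_GROUP_END:
    st->pending++;
    break;
  case SKETCH_GROUP_END:
    st->executed+= st->pending + 1;   /* the buffered records plus this one */
    st->pending= 0;
    break;
  case SKETCH_GROUP_IN_ITSELF:
    st->aborted+= st->pending;        /* previous group is discarded */
    st->pending= 0;
    st->executed++;                   /* the record itself is processed */
    break;
  }
}
```

Feeding two REDO-like records followed by a COMMIT-like record leaves both REDOs counted as aborted, which mirrors the failed-write scenario the message describes: the REDOs of a group that never got its UNDO are not executed during Recovery.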
/* Variables used by all functions of this module. Ok as single-threaded */
static struct st_trn_for_recovery *all_active_trans;
static struct st_table_for_recovery *all_tables;
static struct st_dirty_page *dirty_pages_pool;
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
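The commit message above says the hash key for dirty pages is now built from an index-or-data flag, the table's 2-byte short id and the page number, instead of a file descriptor and page number. One way such a key could be packed into 64 bits is sketched here; the bit layout and the name `sketch_dirty_page_key` are assumptions made for illustration and are not taken from ma_recovery.c:

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical layout: bit 56 = index-or-data flag, bits 40..55 = 2-byte
  table short id, bits 0..39 = page number (assumed to fit in 5 bytes).
*/
static uint64_t sketch_dirty_page_key(int is_index, uint16_t short_id,
                                      uint64_t pageno)
{
  assert(pageno < ((uint64_t) 1 << 40));
  return ((uint64_t) (is_index != 0) << 56) |
         ((uint64_t) short_id << 40) |
         pageno;
}
```

Because the key no longer depends on a file descriptor, it stays meaningful across a restart, where descriptors from the crashed process are gone but short ids can be re-read from the log.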
static LSN current_group_end_lsn;
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: test sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
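The fix described in the commit message above (keep pages pinned while a group's REDOs are applied, and stamp them with the UNDO's LSN only when the group ends) can be illustrated with a toy model. This is a sketch under simplifying assumptions: pages are plain structs, a REDO is skipped when the page's LSN is not older than the record's LSN, and all names (`sketch_page`, `sketch_apply_redo`, `sketch_end_group`) are hypothetical rather than functions of ma_blockrec.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_page
{
  uint64_t lsn;   /* LSN currently stamped on the page */
  int applied;    /* how many REDOs changed this page */
  int pinned;     /* touched by the current group, stamp pending */
};

/* Apply one REDO unless the page is already at least as new. */
static void sketch_apply_redo(struct sketch_page *page, uint64_t redo_lsn)
{
  if (page->lsn >= redo_lsn)
    return;                 /* change already on the page: skip */
  page->applied++;
  page->pinned= 1;          /* keep pinned, leave page->lsn untouched */
}

/* End of group: stamp every pinned page with the UNDO's LSN and unpin. */
static void sketch_end_group(struct sketch_page *pages, size_t n,
                             uint64_t undo_lsn)
{
  size_t i;
  for (i= 0; i < n; i++)
  {
    if (pages[i].pinned)
    {
      pages[i].lsn= undo_lsn;
      pages[i].pinned= 0;
    }
  }
}
```

Had the first REDO stamped the page with the UNDO's LSN immediately, a second REDO of the same group on the same page would compare against that higher LSN and be skipped; deferring the stamp lets both apply, and replaying the same records afterwards is still a no-op.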
#ifndef DBUG_OFF
/** Current group of REDOs is about this table and only this one */
static MARIA_HA *current_group_table;
#endif
static TrID max_long_trid= 0; /**< max long trid seen by REDO phase */
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
static my_bool skip_DDLs; /**< if REDO phase should skip DDL records */
WL#3071 - Maria checkpoint * Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing). * Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown. * fix for a test failure (after-merge fix) include/maria.h: new variable mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result: result update mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test: position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail) storage/maria/ha_maria.cc: Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init(). storage/maria/ma_checkpoint.c: Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset. storage/maria/ma_recovery.c: Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't make anything significant (didn't open any tables, didn't rollback any transactions). 
With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
/** @brief to avoid writing a checkpoint if recovery did nothing. */
static my_bool checkpoint_useful;
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
static my_bool in_redo_phase;
static my_bool trns_created;
static ulong skipped_undo_phase;
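The skipping rules described in the commit message above (skip an unusable table instead of failing, skip records older than skip_redo_lsn during the REDO phase, warn when an UNDO is skipped in the UNDO phase) can be sketched as a small decision function. This is only an illustration under stated assumptions, not the code of ma_recovery.c: `toy_lsn`, `toy_table`, `toy_action` and `toy_decide()` are hypothetical names, and an LSN is modelled as a plain 64-bit integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_lsn;   /* assumption: an LSN modelled as a plain integer */

/* Hypothetical per-table view used by the sketch. */
typedef struct {
  int usable;               /* 0 if corrupted, missing, or repaired with maria_chk */
  toy_lsn skip_redo_lsn;    /* records below this LSN are skipped in the REDO phase */
} toy_table;

typedef enum { TOY_APPLY, TOY_SKIP, TOY_SKIP_WITH_WARNING } toy_action;

/* Decide what to do with one log record that targets table t. */
static toy_action toy_decide(const toy_table *t, toy_lsn rec_lsn,
                             int in_redo_phase)
{
  assert(t != NULL);
  if (!t->usable)
  {
    /* Do not fail recovery; a skipped UNDO in the UNDO phase is worth a warning. */
    return in_redo_phase ? TOY_SKIP : TOY_SKIP_WITH_WARNING;
  }
  if (in_redo_phase && rec_lsn < t->skip_redo_lsn)
    return TOY_SKIP;
  return TOY_APPLY;
}
```

The sketch keeps the warning case separate so that a caller could count such events in a counter of its own.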
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is thus unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is thus unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed in seconds with a fractional part of one digit (like 123.4 seconds).
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
static ulonglong now; /**< for tracking execution time of phases */
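The checksum-delta mechanism from the commit message above (store the variation in the UNDO record, then let the REDO phase apply it only when the table state is older than the record) can be sketched as follows. This is a minimal illustration, assuming a 32-bit checksum (the message says the delta takes 4 bytes) and LSNs modelled as plain integers; `toy_state`, `toy_checksum_delta()` and `toy_redo_apply_delta()` are hypothetical names, and advancing `is_of_lsn` is deliberately left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of the table state used here. */
typedef struct {
  uint32_t checksum;   /* live checksum of the table */
  uint64_t is_of_lsn;  /* LSN up to which this state is known to be current */
} toy_state;

/* Delta stored in the UNDO record: new minus old, wrapping modulo 2^32. */
static uint32_t toy_checksum_delta(uint32_t before, uint32_t after)
{
  return (uint32_t) (after - before);
}

/*
  REDO phase: apply the delta only if the state is older than the record.
  Returns 1 if the delta was applied, 0 if the record was already reflected.
*/
static int toy_redo_apply_delta(toy_state *st, uint64_t rec_lsn, uint32_t delta)
{
  assert(st != NULL);
  if (st->is_of_lsn >= rec_lsn)
    return 0;
  st->checksum += delta;
  return 1;
}
```

With the numbers used in the message, a change from 123 to 456 gives a delta of 333; because the arithmetic is unsigned, a decreasing checksum also round-trips correctly.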
UNDO of rows now puts back all parts of the row on their original pages and positions. Added variable _dbug_on_ to speed up execution when DBUG is not going to be used. Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0). Fixed some bugs with 'non_flushable' marking of bitmap pages. Don't use 'non_flushable' marking of bitmap pages for non-transactional tables. SHOW CREATE TABLE now shows if table was created with page checksums. Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO. More tests (especially for blobs) and DBUG_ASSERTS(). More readable output from maria_read_log and maria_chk. Fixed wrong shift that caused Maria to crash on files > 4G. Mark tables as crashed if REDO fails. dbug/dbug.c: Changed to use my_bool (allowed me to remove some Windows-specific code). Added variable _dbug_on_ to speed up execution when DBUG is not going to be used. Removed initialization of variables if not needed. include/my_dbug.h: Use my_bool for some functions that were defined as BOOLEAN in dbug.c code. Added DEBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used. include/my_global.h: Define my_bool early. Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago. mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests. mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid. mysql-test/r/maria.result: Added testing of page checksums. mysql-test/t/crash_commit_before-master.opt: Added --debug-on as test requires DBUG to work. mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as test requires DBUG to work. mysql-test/t/maria-recovery-master.opt: Added --debug-on as test requires DBUG to work. mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid. mysql-test/t/maria.test: Added testing of page checksums. sql/mysqld.cc: Added --debug-on option (to be able to turn off DBUG with --debug-on=0). Indentation fixes.
Removed end spaces. sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used. storage/maria/Makefile.am: Added ma_test_big.sh. storage/maria/ha_maria.cc: Store in create_info if page checksums are used (for SHOW CREATE TABLE). storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages. Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's. Don't allocate more than 0x3ffff pages in one extent (we need bit 0x4000 as a START_EXTENT_BIT). Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place(). Make _ma_bitmap_get_page_bits() global (needed for UNDO handling). Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for non-transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code. Some code simplifications by adding new local variables. storage/maria/ma_blockrec.c: UNDO of rows now puts back all parts of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information. This changes a lot of logic as now an insert of a row on a page may happen at any position (and not just at the first or next free). Use PAGE_COUNT to mark if an extent is the start of a blob (needed for extent_to_bitmap_blocks()). Added check_directory() for checking that directory entries are correct.
Added checking of row checksums when reading rows (with EXTRA_DEBUG). Added make_space_for_directory() and extend_directory() for doing expansion of directory. Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO. Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry. Added _ma_update_at_original_place() for UNDO of DELETES. Added row->min_length to hold minimum required space needed on head page. Changed find_free_position() to use make_space_for_directory(). Changed make_empty_page() to allow optional creation of directory entry. Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization). Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page. Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position. Ensure allocations of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache.) Write original extent information in UNDO entry, not compacted ones (we need position to tails!).
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP). Fixed some bugs in directory handling. Fixed bug where we wrote wrong lsn to blob pages. Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs. Ensure we call _ma_bitmap_flushable() also in case of errors. When doing an update, first delete old entries, then search in bitmap for where to put new information. Info->s -> share. Rowid -> rowid. More DBUG_ASSERT(). storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER. Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits(). storage/maria/ma_check.c: Don't write extra empty line if there are no deleted blocks. Ignore START_EXTENT_BIT's in page count. Call _ma_fast_unlock_key_del() to free key_del link. storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del().) storage/maria/ma_create.c: Changed constant to macro. storage/maria/ma_delete.c: For deleted keys, log also position to row. storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER. storage/maria/ma_key_recover.c: Added bzero() of LSN that confused page cache in case of uninitialized block. Mark file crashed if applying of index changes fails. Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET. Added _ma_mark_file_crashed(). storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory. storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easily mark files as crashed. storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings.
(Need to investigate if this is ever needed.) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G. storage/maria/ma_recovery.c: In case of errors, start writing on new line if we were in %## %## printing mode (made errors more readable). Changed global variable name from warnings -> recovery_warnings. Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant. Removed special handling of row position for deleted keys. Keys now always include row positions. _ma_apply_undo_row_delete() now gets page and row position. Added check that we don't loop forever when handling undo's (in case of bug in undo chain). Print name of failed REDO/UNDO. storage/maria/ma_recovery.h: Removed old comment. storage/maria/ma_static.c: Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables). storage/maria/ma_test2.c: Added option -u to specify number of rows to update. Changed old option -u to be -A, as for ma_test1. Fixed bug in update of rows with blobs (previously blobs were always reset to empty on update). First created blob is now of max blob length to ensure we have at least one big blob in the table. storage/maria/ma_test_all.sh: More tests. storage/maria/ma_test_recovery.expected: Updated results. storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K). Added new tests that test recovery of update with blobs. Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO). storage/maria/ma_update.c: Simplify code (changed * to if). storage/maria/maria_chk.c: Make output more readable. storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits. Added 2 more bytes to status information. Added 'st_maria_row->min_length' for storing min length needed on head page. Added 'st_maria_handler->blob_buff & blob_buff_size' for
storing blobs. Moved all tuning parameters into one block. Added MARIA_SMALL_BLOB_BUFFER. Added _ma_mark_file_crashed(). storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (previously blobs were always reset to empty on update). storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs. Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling.
2007-12-30 21:40:03 +01:00
static int (*save_error_handler_hook)(uint, const char *,myf);
static uint recovery_warnings; /**< count of warnings */
Fix for LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery. More DBUG_PRINT (to simplify future debugging). Aria: Added STATE_IN_REPAIR, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Aria: Some trivial speedup optimization. Aria: Better warning if table was marked crashed by unfinished repair. mysql-test/lib/v1/mysql-test-run.pl: Fix so one can run RQG. mysql-test/suite/maria/r/maria-recovery2.result: Update for new error message. mysys/stacktrace.c: Fixed compiler warning. storage/maria/ha_maria.cc: More DBUG_PRINT. Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Don't log query for dropping temporary table. storage/maria/ha_maria.h: Added prototype for drop_table(). storage/maria/ma_blockrec.c: More DBUG_PRINT. Make read_long_data() inline for most cases. (Trivial speedup optimization.) storage/maria/ma_check.c: Better warning if table was marked crashed by unfinished repair. storage/maria/ma_open.c: More DBUG_PRINT. storage/maria/ma_recovery.c: Give warning if found crashed table. Changed warning for tables that can't be opened. storage/maria/ma_recovery_util.c: Write warnings to DBUG file. storage/maria/maria_chk.c: Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. storage/maria/maria_def.h: Added maria_mark_in_repair(x). storage/maria/maria_read_log.c: Added option: --character-sets-dir. storage/maria/trnman.c: By default set min_read_from to max value. This allows us to remove TRN:s from rows during recovery to get more space. This fixes bug LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
2010-07-30 09:45:27 +02:00
static uint recovery_found_crashed_tables;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
#define prototype_redo_exec_hook(R) \
static int exec_REDO_LOGREC_ ## R(const TRANSLOG_HEADER_BUFFER *rec)
#define prototype_redo_exec_hook_dummy(R) \
static int exec_REDO_LOGREC_ ## R(const TRANSLOG_HEADER_BUFFER *rec \
2007-12-19 00:20:25 +01:00
__attribute__ ((unused)))
#define prototype_undo_exec_hook(R) \
static int exec_UNDO_LOGREC_ ## R(const TRANSLOG_HEADER_BUFFER *rec, TRN *trn)
prototype_redo_exec_hook(LONG_TRANSACTION_ID);
prototype_redo_exec_hook_dummy(CHECKPOINT);
prototype_redo_exec_hook(REDO_CREATE_TABLE);
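The `##` operator in the macros above pastes the record name onto a fixed prefix, so each invocation declares one uniformly named hook. A minimal, self-contained sketch of the same pattern follows; `toy_header`, `prototype_toy_hook`, the `exec_TOY_*` functions and the `toy_hooks` table are hypothetical stand-ins for illustration, not part of Maria.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the record header passed to each hook. */
typedef struct { int short_id; } toy_header;

/* One macro invocation yields one declaration named exec_TOY_<R>. */
#define prototype_toy_hook(R) \
  static int exec_TOY_ ## R(const toy_header *rec)

prototype_toy_hook(CREATE_TABLE);
prototype_toy_hook(DROP_TABLE);

/* Forward declarations let a dispatch table be built before the bodies. */
typedef int (*toy_hook)(const toy_header *);
enum { TOY_CREATE_TABLE, TOY_DROP_TABLE, TOY_HOOK_COUNT };
static const toy_hook toy_hooks[TOY_HOOK_COUNT]=
{
  exec_TOY_CREATE_TABLE,
  exec_TOY_DROP_TABLE
};

/* The same macro can head the definition, keeping names consistent. */
prototype_toy_hook(CREATE_TABLE)
{
  assert(rec != NULL);
  return rec->short_id + 100;
}

prototype_toy_hook(DROP_TABLE)
{
  assert(rec != NULL);
  return rec->short_id + 200;
}
```

Generating the names from one macro means a record type and its hook cannot drift apart in spelling, and a table indexed by record type can dispatch to the right function.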
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
prototype_redo_exec_hook(REDO_RENAME_TABLE);
prototype_redo_exec_hook(REDO_REPAIR_TABLE);
prototype_redo_exec_hook(REDO_DROP_TABLE);
prototype_redo_exec_hook(FILE_ID);
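The two LSN checks described in the ma_recovery.c notes above (an id->name mapping taken from a LOGREC_FILE_ID is only valid for a certain LSN domain) can be sketched roughly as follows. This is a minimal illustration, not the code in ma_recovery.c: the `lsn_t` typedef and the function names `skip_table_for_mapping()` and `skip_record_for_mapping()` are hypothetical, LSNs are modelled as plain 64-bit integers, and the behaviour for equal LSNs is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplification: an LSN is modelled as a plain 64-bit integer. */
typedef uint64_t lsn_t;

/*
  new_table()-style check. The id->name mapping dates from lsn_of_file_id.
  If it is older than the table's create_rename_lsn, the mapping predates
  the current incarnation of the table, so the table is skipped.
  Assumption: equal LSNs do not count as "older".
*/
static bool skip_table_for_mapping(lsn_t lsn_of_file_id,
                                   lsn_t create_rename_lsn)
{
  return lsn_of_file_id < create_rename_lsn;
}

/*
  get_MARIA_HA_from_REDO_record()-style check. A mapping that is newer than
  the REDO record must not be applied to it (this can happen for records
  located before the checkpoint record), so the record is skipped.
  Assumption: equal LSNs do not count as "newer".
*/
static bool skip_record_for_mapping(lsn_t lsn_of_file_id, lsn_t record_lsn)
{
  return lsn_of_file_id > record_lsn;
}
```

Both helpers only compare LSNs; what "skipping" does to the table or record is left out of the sketch.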
WL#3072 - Maria recovery
maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT*, but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed.
storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first
storage/maria/ma_blockrec.c: comment
storage/maria/ma_loghandler.c: new type of log record
storage/maria/ma_loghandler.h: new type of log record
storage/maria/ma_recovery.c:
* maria_apply_log() now returns a count of warnings. What currently produces warnings is:
  - skipping applying UNDOs though there are some (=> inconsistent table)
  - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (the log misses REDO_INSERT* for those and so is incomplete).
  The count of warnings affects the final message of maria_read_log and recovery (though in recovery neither of the two conditions above should happen).
* maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings
storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function
storage/maria/maria_read_log.c: maria_apply_log() can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
prototype_redo_exec_hook(INCOMPLETE_LOG);
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok) and roll back, giving: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails with "maria_chk: error: Key 1 - Found too many records", not due to this patch (failed before).
BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches?
mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places)
mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing.
storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false).
storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come).
storage/maria/ma_blockrec.h: update to new names
storage/maria/ma_checkpoint.c: update to new names
storage/maria/ma_extra.c: update to new names
storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP
storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP
storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
prototype_redo_exec_hook_dummy(INCOMPLETE_GROUP);
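The boundary effect of LOGREC_INCOMPLETE_GROUP described above can be illustrated with a small model of the REDO phase for a single transaction. This is a sketch under stated assumptions, not the logic of ma_recovery.c: the `rec_kind` enum and `count_executed_redos()` are hypothetical, and the model assumes that REDOs are buffered as a group and only executed once an UNDO or CLR_END closes that group.

```c
#include <stddef.h>

/* Hypothetical record kinds for the log records of one transaction. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Count the REDOs that would be executed. REDOs are buffered until an UNDO
  or CLR_END closes their group; an INCOMPLETE_GROUP marker discards the
  buffered REDOs. REDOs still buffered at the end of the log are ignored.
*/
static size_t count_executed_redos(const enum rec_kind *log, size_t len)
{
  size_t pending = 0, executed = 0;
  for (size_t i = 0; i < len; i++)
  {
    switch (log[i])
    {
    case REC_REDO:
      pending++;               /* part of a group that is not closed yet */
      break;
    case REC_UNDO:
    case REC_CLR_END:
      executed += pending;     /* group is complete: execute its REDOs */
      pending = 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending = 0;             /* boundary: drop the incomplete group */
      break;
    }
  }
  return executed;
}
```

In this model the log REDO - UNDO - REDO* - REDO - CLR_END executes three REDOs, because REDO* gets paired with the later CLR_END. With the marker written after REDO*, only two REDOs are executed.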
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle) - fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc) - when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix) - UNDO_BULK_INSERT_WITH_REPAIR is also used with repair, changes name mysql-test/r/maria.result: tests for bugs fixed mysql-test/t/maria.test: tests for bugs fixed sql/sql_insert.cc: Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging. storage/maria/ha_maria.cc: Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()). storage/maria/ha_maria.h: variable now can have 3 states not 2 storage/maria/ma_bitmap.c: name change storage/maria/ma_blockrec.c: name change storage/maria/ma_blockrec.h: name change storage/maria/ma_check.c: * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before. * the log record of bulk insert can be used even without repair * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush). storage/maria/ma_key_recover.c: name change storage/maria/ma_loghandler.c: name change storage/maria/ma_loghandler.h: name change storage/maria/ma_pagecache.c: A function, to check in debug builds that no dirty pages exist for a file. 
storage/maria/ma_pagecache.h: new function (nothing in non-debug) storage/maria/ma_recovery.c: _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist). storage/maria/maria_chk.c: comment storage/maria/maria_def.h: new names mysql-test/suite/rpl/r/rpl_stm_maria.result: result (it used to crash) mysql-test/suite/rpl/t/rpl_stm_maria.test: Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
prototype_redo_exec_hook(UNDO_BULK_INSERT);
Fix for BUG#37876 "Importing Maria table from other server via binary copy does not work":
- after auto-zerofill (ha_maria::check_and_repair()) the table kept its state's LSNs unchanged, which could be the same as the create_rename_lsn of another pre-existing table, which would break versioning as this LSN serves as unique identifier in the versioning code (in maria_open()). Even the state pieces which maria_zerofill() did change were lost (because they didn't go to disk).
- after this fix, if two tables were auto-zerofilled at the same time (by _ma_mark_changed()) they could receive the same create_rename_lsn, which would break versioning again. Fix is to write a log record each time a table is imported.
- Print state's LSNs (create_rename_lsn, is_of_horizon, skip_redo_lsn) and UUID in maria_chk -dvv.
mysql-test/r/maria-autozerofill.result: result
mysql-test/t/maria-autozerofill.test: Test for auto-zerofilling
storage/maria/ha_maria.cc: The state changes done by auto-zerofilling never reached disk.
storage/maria/ma_check.c: When zerofilling a table, including its pages' LSNs, new state LSNs are needed next time the table is imported into a Maria instance.
storage/maria/ma_create.c: Write LOGREC_IMPORTED_TABLE when importing a table. This is informative and ensures that the table gets a unique create_rename_lsn even though multiple tables are imported by concurrent threads (it advances the log's end LSN).
storage/maria/ma_key_recover.c: comment
storage/maria/ma_locking.c: instead of using translog_get_horizon() for state's LSNs of imported table, use the LSN of to-be-written LOGREC_IMPORTED_TABLE.
storage/maria/ma_loghandler.c: New type of log record
storage/maria/ma_loghandler.h: New type of log record
storage/maria/ma_loghandler_lsn.h: New name for constant, as it can now be used not only by maria_chk but by auto-zerofill too.
storage/maria/ma_open.c: instead of using translog_get_horizon() for state's LSNs of imported table, use the LSN of to-be-written LOGREC_IMPORTED_TABLE.
storage/maria/ma_recovery.c: print content of LOGREC_IMPORTED_TABLE in maria_read_log.
storage/maria/maria_chk.c: print info about LSNs of the table's state, and UUID, when maria_chk -dvv
storage/maria/maria_pack.c: new name for constant
storage/maria/unittest/ma_test_recovery.pl: Now that maria_chk -dvv shows state LSNs and UUID, those need to be filtered out, as maria_read_log -a does not use the same ones as the original run.
2008-07-09 11:02:27 +02:00
prototype_redo_exec_hook(IMPORTED_TABLE);
prototype_redo_exec_hook(REDO_INSERT_ROW_HEAD);
prototype_redo_exec_hook(REDO_INSERT_ROW_TAIL);
prototype_redo_exec_hook(REDO_INSERT_ROW_HEAD);
prototype_redo_exec_hook(REDO_PURGE_ROW_HEAD);
prototype_redo_exec_hook(REDO_PURGE_ROW_TAIL);
Merge some changes from sql directory in 5.1 tree Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added redo_free_head_or_tail() & redo_insert_row_blobs() Added uuid to control file maria_checks now verifies that not used part of bitmap is 0 REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL Fixes problem when trying to read block outside of file during REDO include/my_global.h: STACK_DIRECTION is already set by configure mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Test shrinking of VARCHAR mysys/my_realloc.c: Fixed indentation mysys/safemalloc.c: Fixed indentation sql/filesort.cc: Removed some casts sql/mysqld.cc: Added missing setting of myisam_stats_method_str sql/uniques.cc: Removed some casts storage/maria/ma_bitmap.c: Added printing of bitmap (for debugging) Renamed _ma_print_bitmap() -> _ma_print_bitmap_changes() Added _ma_set_full_page_bits() Fixed bug in ma_bitmap_find_new_place() (affecting updates) when using big files storage/maria/ma_blockrec.c: Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added code to fix some cases where redo when using blobs didn't produce idenital .MAD files as normal usage REDO_FREE_ROW_BLOCKS doesn't anymore change pages; We only mark things free in bitmap Remove TAIL and filler extents from REDO_FREE_BLOCKS log entry. (Fixed some asserts) REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Delete tails in update. (Fixed bug when doing update that shrinks blob/varchar length) Fixed bug when doing insert in block outside of file size. Added redo_free_head_or_tail() & redo_insert_row_blobs() Added pagecache_unlock_by_link() when read fails. 
Much more comments, DBUG and ASSERT entries storage/maria/ma_blockrec.h: Prototypes of new functions Define of SUB_RANGE_SIZE & BLOCK_FILLER_SIZE storage/maria/ma_check.c: Verify that not used part of bitmap is 0 storage/maria/ma_control_file.c: Added uuid to control file storage/maria/ma_loghandler.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_loghandler.h: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_pagecache.c: If we write full block, remove error flag for block. (Fixes problem when trying to read block outside of file) storage/maria/ma_recovery.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_test1.c: Allow option after 'b' to be compatible with ma_test2 (This is just to simplify test scripts like ma_test_recovery) storage/maria/ma_test2.c: Default size of blob is now 1000 instead of 1 storage/maria/ma_test_all.sh: Added test for bigger blobs storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Added test for bigger blobs
2007-10-19 23:24:22 +02:00
prototype_redo_exec_hook(REDO_FREE_HEAD_OR_TAIL);
prototype_redo_exec_hook(REDO_FREE_BLOCKS);
prototype_redo_exec_hook(REDO_DELETE_ALL);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
prototype_redo_exec_hook(REDO_INDEX);
prototype_redo_exec_hook(REDO_INDEX_NEW_PAGE);
prototype_redo_exec_hook(REDO_INDEX_FREE_PAGE);
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
prototype_redo_exec_hook(REDO_BITMAP_NEW_PAGE);
prototype_redo_exec_hook(UNDO_ROW_INSERT);
prototype_redo_exec_hook(UNDO_ROW_DELETE);
prototype_redo_exec_hook(UNDO_ROW_UPDATE);
First part of redo/undo for key pages
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift the record number in keys up by 1 bit to make room for a flag indicating whether a transid follows
Checksum for MyISAM now ignores NULL and the unused part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable

include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
  Added new error message to be used when handler initialization failed
include/my_global.h:
  Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
  Added const to some parameters
mysys/array.c:
  More DBUG
mysys/my_error.c:
  Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h:
  Renamed variables to avoid variable shadowing
sql/handler.h:
  Renamed parameter to avoid variable name conflict
sql/item.h:
  Renamed variables to avoid variable shadowing
sql/log_event_old.h:
  Renamed variables to avoid variable shadowing
sql/set_var.h:
  Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc:
  Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am:
  Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
  Removed extra empty line
storage/maria/ma_checkpoint.c:
  Remove not needed trnman.h
storage/maria/ma_close.c:
  Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
  Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c:
  New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (we need now also for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some not needed arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some not needed initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
  Removed end space
storage/maria/ma_check.c:
  Removed end space
storage/maria/ma_create.c:
  Removed end space
storage/maria/ma_locking.c:
  Removed end space
storage/maria/ma_packrec.c:
  Removed end space
storage/maria/ma_pagecache.c:
  Removed end space
storage/maria/ma_panic.c:
  Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  Use MARIA_RECORD_POS for record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c:
  Removed end space
storage/maria/ma_statrec.c:
  Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Remove not used code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
  Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open()
  Zerofill new used pages for:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c:
  Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
  Updated arguments for changed function calls
storage/myisam/mi_check.c:
  Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
  Update checksums to ignore null columns
storage/myisam/mi_create.c:
  Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
  Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
  Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
  New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
  New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
  New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
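The commit message above mentions shifting the record number stored in keys up by one bit so that the low bit can signal whether a transid follows. A minimal sketch of that kind of encoding is shown here; the function names and the use of `uint64_t` are assumptions made for illustration and are not taken from the Maria sources.

```c
#include <stdint.h>

/* Hypothetical helpers: pack a record position and a one-bit flag. */

/* Shift the position up by one bit and use bit 0 as the flag. */
static uint64_t recpos_to_key_value(uint64_t recpos, int transid_follows)
{
  return (recpos << 1) | (transid_follows ? 1u : 0u);
}

/* Recover the original record position by dropping the flag bit. */
static uint64_t key_value_to_recpos(uint64_t key_value)
{
  return key_value >> 1;
}

/* Non-zero if the stored value says a transid follows the key. */
static int key_value_has_transid(uint64_t key_value)
{
  return (int) (key_value & 1);
}
```

The cost of this layout is one bit of address space for record positions; in exchange, readers can tell from the key alone whether extra bytes follow.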
prototype_redo_exec_hook(UNDO_KEY_INSERT);
prototype_redo_exec_hook(UNDO_KEY_DELETE);
prototype_redo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT);
prototype_redo_exec_hook(COMMIT);
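The `prototype_redo_exec_hook(...)` lines above declare one handler per log record type. The sketch below shows how such a macro pair could work using token pasting and a table of function pointers. It is an illustration only: `record_header`, `redo_hooks`, `install_redo_exec_hook` as written here, and the two-entry enum are simplified stand-ins, not the definitions used in ma_recovery.c.

```c
/* Hypothetical, trimmed-down list of record types. */
enum logrec_type { LOGREC_UNDO_KEY_INSERT, LOGREC_COMMIT, LOGREC_COUNT };

/* Stand-in for the log record header handed to every hook. */
typedef struct
{
  enum logrec_type type;
  unsigned long long lsn;
} record_header;

typedef int (*exec_hook)(const record_header *rec);

/* One slot per record type; empty slots mean "no handler". */
static exec_hook redo_hooks[LOGREC_COUNT];

/* Assumed macro shape: the record name is pasted into the function name. */
#define prototype_redo_exec_hook(R) \
  static int exec_REDO_LOGREC_ ## R(const record_header *rec)
#define install_redo_exec_hook(R) \
  (redo_hooks[LOGREC_ ## R]= exec_REDO_LOGREC_ ## R)

/* Declarations, in the same style as the prototypes above. */
prototype_redo_exec_hook(UNDO_KEY_INSERT);
prototype_redo_exec_hook(COMMIT);

/* Toy bodies that only return a distinguishing value. */
prototype_redo_exec_hook(UNDO_KEY_INSERT) { (void) rec; return 1; }
prototype_redo_exec_hook(COMMIT) { (void) rec; return 2; }

/* Dispatch one record; returns -1 when no hook is installed. */
static int execute_in_redo_phase(const record_header *rec)
{
  exec_hook hook= redo_hooks[rec->type];
  return hook ? hook(rec) : -1;
}
```

Calling `install_redo_exec_hook(COMMIT);` once at startup fills the slot, after which `execute_in_redo_phase()` routes every COMMIT record to its handler.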
WL#3072 Maria Recovery
misc fixes of execution of UNDOs in the UNDO phase:
- into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase).
- declaring all UNDOs and CLR_END as "compressed"
- when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints).
- bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away).
- modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing).
- ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug.

storage/maria/ma_blockrec.c:
  - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead.
  - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN).
  - store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase.
  - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE.
  - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state")
storage/maria/ma_loghandler.c:
  - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed".
  - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records.
  - reset share->id to 0 when deassigning; not useful for now but sounds logical.
storage/maria/ma_recovery.c:
  - if no table is found for a REDO, it's not an error; for an UNDO, it is
  - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records.
  - in the UNDO phase, when we execute an UNDO_ROW_INSERT:
    * update trn->undo_lsn only after executing the record
    * store the _previous_ undo_lsn into the CLR_END
  - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them.
storage/maria/ma_test1.c:
  * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point.
  * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery
storage/maria/ma_test_recovery:
  * Testing execution of UNDOs, with and without BLOBs.
  * Testing idempotency of REDOs.
  * See @todo for a probable bug with BLOBs.
  * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c).
  * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected).
  * Less output on the screen, compares with expected output in the end.
  * some shell thingies like "set --" and $# are courtesy of Danny and Pekka.
storage/maria/maria_read_log.c:
  when only displaying the records, don't do an UNDO phase
storage/maria/ma_test_recovery.expected:
  This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
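According to the commit message above, a CLR_END carries the LSN of the previous UNDO and the type of the record that was undone, and the REDO phase uses both to update `trn->undo_lsn` and, sometimes, `state.records`. The following sketch illustrates that bookkeeping. All struct and enum names are hypothetical, and the direction of each count adjustment (an undone insert removes a row, an undone delete restores one) is inferred from the description rather than copied from the source.

```c
#include <stdint.h>

typedef uint64_t LSN;

/* Hypothetical subset of record types a CLR_END may refer to. */
enum undone_type { UNDONE_ROW_INSERT, UNDONE_ROW_DELETE, UNDONE_ROW_UPDATE };

typedef struct { LSN undo_lsn; } trn_sketch;
typedef struct { uint64_t records; } table_state_sketch;

/* What this sketch assumes a CLR_END stores. */
typedef struct
{
  LSN previous_undo_lsn;     /* may be 0 for the transaction's first UNDO */
  enum undone_type undone;   /* type of the record that was rolled back */
} clr_end_sketch;

/* Replay one CLR_END during the REDO phase. */
static void apply_clr_end_in_redo_phase(const clr_end_sketch *clr,
                                        trn_sketch *trn,
                                        table_state_sketch *state)
{
  /* The rollback chain now continues from the previous UNDO. */
  trn->undo_lsn= clr->previous_undo_lsn;

  switch (clr->undone)
  {
  case UNDONE_ROW_INSERT:    /* the inserted row is gone again */
    state->records--;
    break;
  case UNDONE_ROW_DELETE:    /* the deleted row is back */
    state->records++;
    break;
  case UNDONE_ROW_UPDATE:    /* row count unchanged */
    break;
  }
}
```

Because 0 is a legitimate value for the previous UNDO's LSN, it cannot double as a "not UNDO work" marker, which is why the commit switches that marker to LSN_ERROR.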
prototype_redo_exec_hook(CLR_END);
prototype_redo_exec_hook(DEBUG_INFO);
prototype_undo_exec_hook(UNDO_ROW_INSERT);
prototype_undo_exec_hook(UNDO_ROW_DELETE);
prototype_undo_exec_hook(UNDO_ROW_UPDATE);
prototype_undo_exec_hook(UNDO_KEY_INSERT);
prototype_undo_exec_hook(UNDO_KEY_DELETE);
prototype_undo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT);
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle)
- fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc)
- when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix)
- UNDO_BULK_INSERT_WITH_REPAIR is also used without repair, so it is renamed to UNDO_BULK_INSERT

mysql-test/r/maria.result:
  tests for bugs fixed
mysql-test/t/maria.test:
  tests for bugs fixed
sql/sql_insert.cc:
  Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging.
storage/maria/ha_maria.cc:
  Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()).
storage/maria/ha_maria.h:
  variable now can have 3 states not 2
storage/maria/ma_bitmap.c:
  name change
storage/maria/ma_blockrec.c:
  name change
storage/maria/ma_blockrec.h:
  name change
storage/maria/ma_check.c:
  * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before.
  * the log record of bulk insert can be used even without repair
  * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush).
storage/maria/ma_key_recover.c:
  name change
storage/maria/ma_loghandler.c:
  name change
storage/maria/ma_loghandler.h:
  name change
storage/maria/ma_pagecache.c:
  A function, to check in debug builds that no dirty pages exist for a file.
storage/maria/ma_pagecache.h:
  new function (nothing in non-debug)
storage/maria/ma_recovery.c:
  _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist).
storage/maria/maria_chk.c:
  comment
storage/maria/maria_def.h:
  new names
mysql-test/suite/rpl/r/rpl_stm_maria.result:
  result (it used to crash)
mysql-test/suite/rpl/t/rpl_stm_maria.test:
  Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
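The commit message above describes a table keeping its real TRN across a disable/reenable logging cycle, with a dummy transaction attached only when needed, and a check for leftover dirty pages when the caller declines the flush. A simplified model of that behaviour might look like the sketch below. Every name is hypothetical, "when needed" is assumed to mean "when no transaction is attached", and the dirty-page check is modelled as a return value rather than a debug assertion.

```c
#include <stddef.h>

typedef struct { int id; } trn_sketch;

/* Stand-in for a shared placeholder transaction. */
static trn_sketch dummy_trn= { 0 };

typedef struct
{
  trn_sketch *trn;       /* transaction currently attached to the handle */
  int logging_enabled;
  int dirty_pages;       /* pages changed and not yet flushed */
} table_sketch;

/* Turn logging off; a real TRN, if present, stays attached. */
static void tmp_disable_logging(table_sketch *t)
{
  t->logging_enabled= 0;
  if (t->trn == NULL)
    t->trn= &dummy_trn;  /* only fill the gap, never replace a real TRN */
}

/*
  Turn logging back on. If the caller accepts the flush offer, dirty pages
  are written out (modelled by zeroing the counter). Otherwise the return
  value is 1 when dirty pages still exist, which is the situation the real
  code is described as verifying against.
*/
static int reenable_logging(table_sketch *t, int flush_pages)
{
  t->logging_enabled= 1;
  if (t->trn == &dummy_trn)
    t->trn= NULL;        /* drop the placeholder again */
  if (flush_pages)
  {
    t->dirty_pages= 0;
    return 0;
  }
  return t->dirty_pages != 0;
}
```

The point of the design is that code running between the two calls never sees a NULL transaction, while code running after them sees exactly the transaction it started with.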
prototype_undo_exec_hook(UNDO_BULK_INSERT);
static int run_redo_phase(LSN lsn, LSN end_lsn,
enum maria_apply_log_way apply);
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)

mysys/my_realloc.c:
  noted an issue in my_realloc()
sql/mysqld.cc:
  page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc:
  update to new prototype
storage/maria/ma_bitmap.c:
  when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c:
  Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c:
  empty data file has a bitmap page
storage/maria/ma_control_file.c:
  useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h:
  new prototype
storage/maria/ma_create.c:
  Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c:
  as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h:
  new function
storage/maria/ma_loghandler_lsn.h:
  maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
  * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
  * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
  * comments.
storage/maria/ma_recovery.c:
  * preparations for the UNDO phase: recreate TRNs
  * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
  * reworking all around (less duplication)
storage/maria/ma_recovery.h:
  a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c:
  tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
  * update to new prototype
  * no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
  * a function for Recovery to create a transaction (TRN), needed in the UNDO phase
  * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h:
  new functions
2007-08-29 16:43:01 +02:00
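The commit message above mentions a list of dirty pages and a test of rec_lsn to decide whether a page can be skipped during Recovery. One way such a test can be written is sketched below. The rule used here (a record older than the checkpoint may be skipped if its page is absent from the list, or if the record's LSN is below the page's rec_lsn) is the usual checkpoint-based reasoning and is an assumption about the intent; the struct, function name and linear search are purely illustrative.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t LSN;

/* One entry of a hypothetical dirty pages list taken at checkpoint time. */
typedef struct
{
  uint64_t page;   /* page number */
  LSN rec_lsn;     /* LSN of the first change not yet on disk */
} dirty_page_sketch;

/*
  Return 1 if the REDO at record_lsn for the given page can be skipped,
  0 if it has to be examined further (for example by reading the page).
*/
static int redo_can_skip_page(LSN record_lsn, LSN checkpoint_start,
                              uint64_t page,
                              const dirty_page_sketch *dirty, size_t n_dirty)
{
  size_t i;

  /* The list only describes the state as of the checkpoint. */
  if (record_lsn >= checkpoint_start)
    return 0;

  for (i= 0; i < n_dirty; i++)
  {
    if (dirty[i].page == page)
      return record_lsn < dirty[i].rec_lsn;  /* change already on disk */
  }
  return 1;   /* page was clean at checkpoint time */
}
```

A real implementation would likely use a hash lookup instead of a linear scan, since the list can be large.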
static uint end_of_redo_phase(my_bool prepare_for_undo_phase);
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped.
Tested by hand.
Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before).

BitKeeper/triggers/post-commit:
  no truncation of the commit mail, or how to review patches?
mysql-test/include/maria_verify_recovery.inc:
  let caller choose the statement used to crash (sometimes we want the crash to happen at special places)
mysql-test/t/maria-recovery.test:
  user of maria_verify_recovery.inc now specifies statement which the script should use for crashing.
storage/maria/ma_bitmap.c:
  it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false).
storage/maria/ma_blockrec.c:
  update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come).
storage/maria/ma_blockrec.h:
  update to new names
storage/maria/ma_checkpoint.c:
  update to new names
storage/maria/ma_extra.c:
  update to new names
storage/maria/ma_loghandler.c:
  new LOGREC_INCOMPLETE_GROUP
storage/maria/ma_loghandler.h:
  new LOGREC_INCOMPLETE_GROUP
storage/maria/ma_recovery.c:
  When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
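The scenario in the commit message above (REDO - UNDO - REDO* followed by a rollback that appends REDO - CLR) can be replayed with a toy scanner. The sketch below models one transaction's records as an array and counts how many REDOs a recovery would execute: pending REDOs are executed when an UNDO or CLR_END closes their group, and discarded when an incomplete-group marker is met. The enum and function are hypothetical simplifications; real log records carry much more than a kind.

```c
#include <stddef.h>

/* Hypothetical record kinds for a single transaction's log. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/* Count the REDOs that end up in a complete group and are executed. */
static size_t count_redos_applied(const enum rec_kind *log, size_t n)
{
  size_t pending= 0, applied= 0, i;

  for (i= 0; i < n; i++)
  {
    switch (log[i])
    {
    case REC_REDO:
      pending++;             /* wait for the record that ends the group */
      break;
    case REC_UNDO:
    case REC_CLR_END:
      applied+= pending;     /* group is complete: execute its REDOs */
      pending= 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending= 0;            /* boundary: the earlier REDOs stay skipped */
      break;
    }
  }
  return applied;            /* trailing pending REDOs are ignored */
}
```

With this model, the log after the first crash (REDO, UNDO, REDO*) executes one REDO. Without a marker, the log after rollback (REDO, UNDO, REDO*, REDO, CLR_END) would execute three, wrongly including REDO*. With the marker placed after REDO*, only two are executed.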
static int run_undo_phase(uint uncommitted);
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
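The three-way classification of log records described in the commit message above (a record may not end a group, may end a group, or may be a group in itself that aborts any unfinished group) can be sketched as follows. This is only an illustration: the enum values and the function `count_executed()` are hypothetical names, not the identifiers used in ma_loghandler.h or ma_recovery.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three states of record_ends_group. */
enum group_role
{
  REC_NOT_GROUP_END,  /* part of a group, does not close it (e.g. a REDO) */
  REC_GROUP_END,      /* closes the current group (e.g. an UNDO) */
  REC_IS_GROUP        /* a group in itself (e.g. COMMIT) */
};

/*
  Walks a sequence of records and returns how many would be executed.
  Records of an unfinished group are dropped (and counted in *aborted)
  when a record which is a group in itself is seen. Records still
  pending at the end of the sequence are neither executed nor counted.
*/
static size_t count_executed(const enum group_role *recs, size_t n,
                             size_t *aborted)
{
  size_t pending= 0, executed= 0, i;
  assert(recs != NULL || n == 0);
  *aborted= 0;
  for (i= 0; i < n; i++)
  {
    switch (recs[i])
    {
    case REC_NOT_GROUP_END:
      pending++;                 /* wait for the end of the group */
      break;
    case REC_GROUP_END:
      executed+= pending + 1;    /* the whole group is complete */
      pending= 0;
      break;
    case REC_IS_GROUP:
      *aborted+= pending;        /* unfinished group is abandoned */
      pending= 0;
      executed++;                /* the record itself is processed */
      break;
    }
  }
  return executed;
}
```

For the sequence REDO, REDO, UNDO, REDO, COMMIT this sketch executes four records and abandons the lone REDO that was never closed by an UNDO, which is the situation the commit message describes for a failed physical write.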
static void display_record_position(const LOG_DESC *log_desc,
                                    const TRANSLOG_HEADER_BUFFER *rec,
                                    uint number);
static int display_and_apply_record(const LOG_DESC *log_desc,
                                    const TRANSLOG_HEADER_BUFFER *rec);
static MARIA_HA *get_MARIA_HA_from_REDO_record(const
                                               TRANSLOG_HEADER_BUFFER *rec);
static MARIA_HA *get_MARIA_HA_from_UNDO_record(const
                                               TRANSLOG_HEADER_BUFFER *rec);
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state). storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
static void prepare_table_for_close(MARIA_HA *info, TRANSLOG_ADDRESS horizon);
static LSN parse_checkpoint_record(LSN lsn);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
static void new_transaction(uint16 sid, TrID long_id, LSN undo_lsn,
                            LSN first_undo_lsn);
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
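Two details from the ma_recovery.c part of this commit message lend themselves to a short sketch: the dirty-pages hash key built from index_or_data, the table's short id and the page number, and the LSN below which the dirty-pages list is consulted, min(checkpoint_start_log_horizon, min(trn's rec_lsn)). The bit layout (40 bits of page number, 16 bits of short id, 1 bit for index/data) and all names below are assumptions made for illustration, not taken from the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;  /* simplified stand-in for LSN */

/*
  Packs (index_or_data, short id, page number) into one 64-bit key.
  Assumed layout: page number in the low 40 bits, short id in the next
  16 bits, index/data flag above that.
*/
static uint64_t dirty_page_key(int is_index, uint16_t short_id,
                               uint64_t pageno)
{
  assert(pageno < ((uint64_t) 1 << 40));
  return ((((uint64_t) (is_index ? 1 : 0) << 16) | short_id) << 40) | pageno;
}

/*
  LSN below which the dirty pages list is consulted:
  min(checkpoint_start_log_horizon, min over transactions of rec_lsn).
*/
static lsn_t dirty_pages_threshold(lsn_t checkpoint_start_log_horizon,
                                   const lsn_t *trn_rec_lsn, size_t n_trns)
{
  lsn_t min_lsn= checkpoint_start_log_horizon;
  size_t i;
  for (i= 0; i < n_trns; i++)
    if (trn_rec_lsn[i] < min_lsn)
      min_lsn= trn_rec_lsn[i];
  return min_lsn;
}
```

Because the key no longer depends on a file descriptor, the same page of the same table gets the same key whichever process opened the file, and the data and index files of one table cannot collide.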
static int new_table(uint16 sid, const char *name, LSN lsn_of_file_id);
static int new_page(uint32 fileid, pgcache_page_no_t pageid, LSN rec_lsn,
                    struct st_dirty_page *dirty_page);
static int close_all_tables(void);
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE. storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
static my_bool close_one_table(const char *name, TRANSLOG_ADDRESS addr);
static void print_redo_phase_progress(TRANSLOG_ADDRESS addr);
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
static void delete_all_transactions();
2007-07-26 11:56:21 +02:00
/** @brief global [out] buffer for translog_read_record(); never shrinks */
static struct
{
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
  /*
    uchar* is better suited (fewer casts) than char*, thus we don't use
    LEX_STRING.
  */
  uchar *str;
  size_t length;
} log_record_buffer;

/** @brief Grows log_record_buffer so that it can hold all of rec */
static void enlarge_buffer(const TRANSLOG_HEADER_BUFFER *rec)
{
  if (log_record_buffer.length < rec->record_length)
  {
    log_record_buffer.length= rec->record_length;
    /* MY_ALLOW_ZERO_PTR: str may still be NULL on the first call */
    log_record_buffer.str= my_realloc(log_record_buffer.str,
                                      rec->record_length,
                                      MYF(MY_WME | MY_ALLOW_ZERO_PTR));
  }
}
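The grow-only pattern used by enlarge_buffer() can be sketched in isolation as follows. This is a minimal sketch, not the file's code: it assumes plain realloc() instead of my_realloc(), and the names grow_buffer, grow_buffer_reserve() and grow_buffer_free() are hypothetical. Unlike the function above, the sketch updates the stored length only after the allocation has succeeded, which is a choice made here for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for log_record_buffer: grows on demand, never shrinks. */
typedef struct
{
  unsigned char *str;
  size_t length;                      /* current capacity in bytes */
} grow_buffer;

/*
  Ensure buf can hold at least `needed` bytes.
  Returns 0 on success, 1 on allocation failure. On failure the old
  pointer and length are left untouched, so the caller can still free them.
*/
static int grow_buffer_reserve(grow_buffer *buf, size_t needed)
{
  unsigned char *p;
  assert(buf != NULL);
  if (buf->length >= needed)
    return 0;                         /* already big enough: never shrink */
  p= realloc(buf->str, needed);       /* realloc(NULL, n) acts like malloc(n) */
  if (p == NULL)
    return 1;
  buf->str= p;
  buf->length= needed;
  return 0;
}

/* Release the buffer and reset it to the empty state. */
static void grow_buffer_free(grow_buffer *buf)
{
  assert(buf != NULL);
  free(buf->str);
  buf->str= NULL;
  buf->length= 0;
}
```

A caller would start from `grow_buffer b= { NULL, 0 };`, call grow_buffer_reserve() before each read with the size of the incoming record, and call grow_buffer_free() once at shutdown. Requests smaller than the current capacity return immediately without touching the allocation.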
/** @brief Tells what kind of progress message was printed to the error log */
static enum recovery_message_type
{
  REC_MSG_NONE= 0, REC_MSG_REDO, REC_MSG_UNDO, REC_MSG_FLUSH
} recovery_message_printed;
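One of the commit notes attached to this file mentions printing an extra newline before an error message while a progress line ("0% 10% ...") is being printed. A state variable such as recovery_message_printed makes that kind of decision possible. The sketch below illustrates the idea under stated assumptions: the two-state enum and the functions print_progress() and print_error() are hypothetical, they write into a caller-supplied buffer rather than to the error log, and they are not the eprint() function the commit note refers to.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified version of the "what was printed last" state. */
enum progress_state { PROGRESS_NONE= 0, PROGRESS_OPEN_LINE };

static enum progress_state progress= PROGRESS_NONE;

/* Format one percentage of a progress line, e.g. " 10%", and remember
   that the current output line is now unterminated. */
static int print_progress(char *out, size_t size, unsigned percent)
{
  progress= PROGRESS_OPEN_LINE;
  return snprintf(out, size, " %u%%", percent);
}

/* Format an error message. If a progress line is still open, start with
   a newline so that the message does not end up glued to the percentages. */
static int print_error(char *out, size_t size, const char *msg)
{
  const char *prefix= (progress == PROGRESS_OPEN_LINE) ? "\n" : "";
  progress= PROGRESS_NONE;
  return snprintf(out, size, "%s%s\n", prefix, msg);
}
```

Keeping the state in one variable means every printing function can check it, which is presumably why the file keeps a single file-level recovery_message_printed rather than passing a flag around.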
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread). Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break). Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel). REDO_REPAIR now also logs used key map. Fixed some bugs in REDO logging of key pages. Better error messages from maria_read_log. Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with fewer than 8 blocks (causes strange crashes). Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise). Fixed bug in ma_pagecache unit tests that caused program to sometimes fail. Added some missing calls to MY_INIT(); their absence caused some unit tests to fail. Fixed TRUNCATE so that it works properly on temporary MyISAM files. Updated some result files to new table checksum results (checksum when NULL fields are ignored). perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
UNDO of rows now puts back all parts of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for non-transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed if REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that were defined as BOOLEAN in dbug.c code Added DBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as the test requires DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as the test requires DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as the test requires DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn off DBUG with --debug-on=0) Indentation fixes 
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for not transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all part of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directroy entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minmum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_isnert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in it's original position Ensure allocation of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed som bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there is no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused paged cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easly mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings. 
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we where in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always includes row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Chaned version number of Maria files to not accidently use old ones (becasue of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before blobs was always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that tests recovery of update with blobs Removed comparision of .MAD file as it's not guranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_mara_row->min_length' for storing min length needed on head page Added 'st_mara_handler->blob_buff & blob_buff_size' for 
storing blobs Moved all tunning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before blobs was always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the random-ness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
/* Hook to ensure we get nicer output if we get an error */
int maria_recover_error_handler_hook(uint error, const char *str,
                                     myf flags)
{
  if (procent_printed)
  {
    procent_printed= 0;
    fputc('\n', stderr);
    fflush(stderr);
  }
  return (*save_error_handler_hook)(error, str, flags);
}
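The hook above follows a common save-and-chain pattern: the previous handler is remembered, and the wrapper terminates any half-printed progress line before delegating. The following is only a minimal standalone sketch of that pattern, not the Maria code; `error_hook_t`, `saved_hook`, `progress_line_open`, `default_hook`, `wrapping_hook` and `demo_hook_chain` are hypothetical names introduced for illustration.

```c
#include <stdio.h>

/* Hypothetical hook type; the real hook takes (uint, const char *, myf). */
typedef int (*error_hook_t)(unsigned int error, const char *str, int flags);

/* Hypothetical stand-ins for save_error_handler_hook and procent_printed. */
static error_hook_t saved_hook;
static int progress_line_open;
static int newlines_emitted;

/* Plays the role of the handler that was installed before recovery. */
static int default_hook(unsigned int error, const char *str, int flags)
{
  (void) flags;
  fprintf(stderr, "error %u: %s\n", error, str);
  return 0;
}

/* Ends an unfinished progress line, then delegates to the saved hook. */
static int wrapping_hook(unsigned int error, const char *str, int flags)
{
  if (progress_line_open)
  {
    progress_line_open= 0;
    newlines_emitted++;
    fputc('\n', stderr);
    fflush(stderr);
  }
  return (*saved_hook)(error, str, flags);
}

/* Installs the wrapper, reports two errors, restores the old hook.
   Returns how many line breaks the wrapper had to emit. */
int demo_hook_chain(void)
{
  error_hook_t current= default_hook;
  newlines_emitted= 0;
  saved_hook= current;                 /* remember the previous hook */
  current= wrapping_hook;              /* install the wrapper */
  fputs("recovered pages: 0% 67%", stderr);
  progress_line_open= 1;               /* a progress line is pending */
  (*current)(1, "first error", 0);     /* breaks the line, then reports */
  (*current)(2, "second error", 0);    /* no pending line, reports only */
  current= saved_hook;                 /* restore the previous hook */
  (void) current;
  return newlines_emitted;
}
```

Only the first error needs the extra line break, because the wrapper clears the pending-progress flag before delegating.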
/* Define this if you want gdb to break in some interesting situations */
#define ALERT_USER()
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scracth). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
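The commit note above says a record type may not end a group, may end a group, or may be a group in itself, and that a record which is a group in itself (like COMMIT) aborts any unfinished group before it. A rough sketch of that rule follows, assuming a simple in-memory array of records; the enum, the struct and `count_executed()` are hypothetical names, not the actual LOG_DESC definitions.

```c
#include <stddef.h>

/* Hypothetical three-way classification of a record type. */
enum group_role
{
  REC_NOT_GROUP_END,    /* e.g. a REDO: waits for the end of its group */
  REC_GROUP_END,        /* e.g. an UNDO: closes the group before it */
  REC_IS_GROUP_ITSELF   /* e.g. COMMIT: stands alone, aborts open group */
};

struct log_rec
{
  const char *name;
  enum group_role role;
};

/*
  Scans records in log order and returns how many would be executed.
  Records are held back until a group-ending record is seen. A record
  that is a group in itself discards the unfinished group before it;
  the number of discarded records is added to *aborted.
  Records still pending at the end of the log are not executed.
*/
size_t count_executed(const struct log_rec *recs, size_t n, size_t *aborted)
{
  size_t pending= 0, executed= 0, i;
  *aborted= 0;
  for (i= 0; i < n; i++)
  {
    switch (recs[i].role) {
    case REC_NOT_GROUP_END:
      pending++;
      break;
    case REC_GROUP_END:
      executed+= pending + 1;          /* the whole group, end included */
      pending= 0;
      break;
    case REC_IS_GROUP_ITSELF:
      *aborted+= pending;              /* unfinished group is dropped */
      pending= 0;
      executed++;
      break;
    }
  }
  return executed;
}
```

With the sequence REDO_INSERT, UNDO_ROW_INSERT, REDO_INSERT, COMMIT, the first two records form a complete group, while the second REDO_INSERT is dropped when COMMIT arrives.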
WL#3071 Maria checkpoint, WL#3072 Maria recovery instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]). Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line. sql_print_error() changed to use my_progname_short (nicer display). mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r"). include/my_sys.h: new flags to tell error_handler_hook that this is not an error but an information or warning mysql-test/mysql-test-run.pl: when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r") mysys/my_init.c: set my_progname_short mysys/my_static.c: my_progname_short added sql/mysqld.cc: * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria. 
* plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs()) * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen) storage/maria/ma_checkpoint.c: fprintf(stderr) -> ma_message_no_user() storage/maria/ma_checkpoint.h: function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr). storage/maria/ma_recovery.c: To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line. 071116 18:42:16 [Note] mysqld: Maria engine: starting recovery recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds); 071116 18:42:16 [Note] mysqld: Maria engine: recovery done storage/maria/maria_chk.c: my_progname_short moved to mysys storage/maria/maria_read_log.c: my_progname_short moved to mysys storage/myisam/myisamchk.c: my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
static void print_preamble()
{
  ma_message_no_user(ME_JUST_INFO, "starting recovery");
}
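The ME_JUST_INFO flag used above lets recovery announce its start as a [Note] rather than an [ERROR]. The sketch below shows one way such flag-based routing could look; it is not the my_message_sql() implementation. The DEMO_* flag values and both function names are assumptions made for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical flag bits standing in for ME_JUST_INFO / ME_JUST_WARNING. */
#define DEMO_JUST_INFO    1
#define DEMO_JUST_WARNING 2

/* Picks the severity label: info and warning flags downgrade the
   message, anything else is reported as an error. */
const char *demo_severity_label(int flags)
{
  if (flags & DEMO_JUST_INFO)
    return "[Note]";
  if (flags & DEMO_JUST_WARNING)
    return "[Warning]";
  return "[ERROR]";
}

/* Formats a message the way a task with no connected user might log it.
   Returns the value of snprintf(). */
int demo_format_message(char *buf, size_t size, int flags, const char *msg)
{
  return snprintf(buf, size, "%s Maria engine: %s",
                  demo_severity_label(flags), msg);
}
```

Calling `demo_format_message()` with the info flag and "starting recovery" yields a [Note] line, while a flag value of 0 yields [ERROR].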
/**
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
   @brief Recovers from the last checkpoint.

   Runs the REDO phase using special structures, then sets up the playground
   of runtime: recreates transactions inside trnman, opens tables with their
   two-byte-id mapping; takes a checkpoint and runs the UNDO phase. Closes all
   tables.
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
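The two id->name validity rules from the commit message above (the one for new_table() and the one for get_MARIA_HA_from_REDO_record()) come down to plain LSN comparisons. The following is a minimal sketch of that idea, not the actual ma_recovery.c code: `lsn_sketch_t`, `struct mapping_sketch` and both helper names are hypothetical, and treating "older" and "newer" as strict comparisons is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a log sequence number; larger means later. */
typedef uint64_t lsn_sketch_t;

/* Hypothetical view of one id->name mapping found in the log. */
struct mapping_sketch
{
  lsn_sketch_t lsn_of_file_id;    /* LSN of the LOGREC_FILE_ID that made it */
  lsn_sketch_t create_rename_lsn; /* LSN stored in the table at create/rename */
};

/* new_table() rule: mapping is older than the table's create/rename LSN. */
static bool mapping_is_stale_sketch(const struct mapping_sketch *m)
{
  assert(m != NULL);
  return m->lsn_of_file_id < m->create_rename_lsn;
}

/* get_MARIA_HA_from_REDO_record() rule: mapping is newer than the record. */
static bool record_predates_mapping_sketch(const struct mapping_sketch *m,
                                           lsn_sketch_t record_lsn)
{
  assert(m != NULL);
  return m->lsn_of_file_id > record_lsn;
}
```

In both cases a true result means "skip": either the whole table (stale mapping) or the single record (a new mapping must not apply to an old REDO).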
@return Operation status
@retval 0 OK
@retval !=0 Error
2007-07-26 11:56:21 +02:00
*/
WL#4374 "Maria - force start if Recovery fails multiple times" http://forge.mysql.com/worklog/task.php?id=4374 new option --maria-force-start-after-recovery-failures=N; number of consecutive recovery failures (failures of log reading or recovery processing, anything in [translog_init(),maria_recovery_from_log()]) is stored in the control file; if at a Maria start they are more than N, logs are removed. This is for automated systems which have to run whatever happens. As tables risk staying corrupted, --maria-recover should also be used on them: this revision makes maria-recover work (it was disabled). Fixed bug in translog_is_log_files(). translog_init() now prints message to error log if failed. Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there. KNOWN_BUGS.txt: As option --maria-force-start-after-recovery-failures is added, it corresponds to the wish "we should fix that if this happens etc". LOAD INDEX is not ignored since a few weeks. Listed concurrency bugs have been fixed some time ago. Recovery of fulltext and GIS indexes works since a few weeks. mysql-test/include/maria_make_snapshot.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/include/maria_make_snapshot_for_comparison.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/include/maria_verify_recovery.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/lib/mtr_report.pl: new test maria-recover.test generates expected corruption warnings in the error log. maria-recovery.test's corrupted table is renamed to t_corrupted1 instead of t1. mysql-test/r/maria-preload.result: result update. 
maria_pagecache_read* values are similar to the previous version of this file, though a bit bigger because using the information_schema and the join leads to some internal maria temp table being used, and thus some blocks of it being read. mysql-test/r/maria-purge.result: engine's name in SHOW ENGINE MARIA LOGS changed. mysql-test/r/maria-recover.result: result for new test. We see corruption messages at first SELECT and then none at second SELECT, expected. mysql-test/r/maria-recovery.result: result update mysql-test/r/maria.result: new variables show up mysql-test/t/disabled.def: BUG#34911 is not fixed but the test had been made independent of the bug (workaround). A new bug (crash) has popped recently, so it has to stay disabled (BUG#35107). mysql-test/t/maria-preload.test: Work around BUG#34911 "FLUSH STATUS doesn't flush what it should": compute differences in status variables before and after relevant queries mysql-test/t/maria-recover-master.opt: test --maria-recover mysql-test/t/maria-recover.test: Test of the --maria-recover option (build a corrupted table and see if it is auto-repaired) mysql-test/t/maria-recovery-big.test: update for new API of include/maria*.inc mysql-test/t/maria-recovery-bitmap.test: update for new API of include/maria*.inc mysql-test/t/maria-recovery.test: update for new API of include/maria*.inc. Corrupted table t1 renamed to t_corrupted1, so that mtr_report.pl does not blindly remove all corruption messages for t1 which is a common name. storage/maria/ha_maria.cc: Enabling maria-recover. Adding option and global variable --maria_force_start_after_recovery_failures: ha_maria_init() calls mark_recovery_start() and mark_recovery_success() to keep track of failed consecutive recoveries and remove logs if needed. Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there. 
storage/maria/ma_checkpoint.c: new prototype storage/maria/ma_control_file.c: Storing in one byte in the control file, the number of consecutive recovery failures. storage/maria/ma_control_file.h: new prototype storage/maria/ma_init.c: new prototype storage/maria/ma_locking.c: Need to update open_count on disk at first write and close for transactional tables, like we already did for non-transactional tables, otherwise we cannot notice that the table is dubious. storage/maria/ma_loghandler.c: translog_is_log_files() is made more generic to serve either to search or to delete logs (the latter is for --maria-force-start-after-recovery-failures). It also had a bug (always returned FALSE). storage/maria/ma_loghandler.h: export function because ha_maria::mark_recovery_start() needs it storage/maria/ma_recovery.c: changing name of maria_recover() to distinguish from the maria-recover option. storage/maria/ma_recovery.h: changing name of maria_recover() to distinguish from the maria-recover option. storage/maria/ma_test_force_start.pl: Test of --maria-force-start-after-recovery-failures (and also, to be realistic, of --maria-recover). This is standalone because mysql-test-run does not support testing that multiple mysqld restarts expectedly failed. I'll have to run it on my machine and also on a Windows machine. storage/maria/unittest/ma_control_file-t.c: adding recovery_failures to the test storage/maria/unittest/ma_test_loghandler_multigroup-t.c: fix for compiler warning (unused variable in non-debug build)
2008-06-02 22:53:25 +02:00
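The failure-counting scheme described for --maria-force-start-after-recovery-failures=N can be sketched as follows. This is only an illustration under stated assumptions: `struct control_file_sketch` and both function names are hypothetical, the counter is assumed to be incremented pessimistically at every recovery start and reset on success, and N == 0 is assumed to mean "feature disabled". Only the one-byte counter and the "more than N" test come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory image of the relevant control file byte. */
struct control_file_sketch
{
  uint8_t recovery_failures; /* consecutive failed recoveries, one byte */
};

/*
  Called before recovery starts. Returns true if the logs should be removed
  because more than max_failures consecutive recoveries already failed.
  The counter is bumped up front so that a crash during recovery is counted.
*/
static bool recovery_start_sketch(struct control_file_sketch *cf,
                                  unsigned max_failures)
{
  bool remove_logs;
  assert(cf != NULL);
  remove_logs= max_failures > 0 && cf->recovery_failures > max_failures;
  if (cf->recovery_failures < UINT8_MAX) /* saturate, the field is one byte */
    cf->recovery_failures++;
  return remove_logs;
}

/* Called once recovery has completed without error. */
static void recovery_success_sketch(struct control_file_sketch *cf)
{
  assert(cf != NULL);
  cf->recovery_failures= 0;
}
```

With N = 2, three consecutive failed starts leave the counter at 3, so the fourth start reports that logs should be removed; a successful recovery afterwards resets the counter to 0.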
int maria_recovery_from_log(void)
{
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
int res= 1;
2007-07-26 11:56:21 +02:00
FILE *trace_file;
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
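The two warning sources named in the commit message above (UNDOs present but the UNDO phase skipped, and a LOGREC_INCOMPLETE_LOG record found in the log) can be illustrated with a small counting pass. This is a sketch under assumptions, not maria_apply_log() itself: the enum, its member names and `count_warnings_sketch()` are hypothetical, and counting one warning per incomplete-log record plus at most one for skipped UNDOs is an assumed policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced set of record kinds. */
enum rec_kind_sketch
{
  REC_REDO_SKETCH,
  REC_UNDO_SKETCH,
  REC_INCOMPLETE_LOG_SKETCH /* stands for LOGREC_INCOMPLETE_LOG */
};

/* Returns the number of warnings a log-applying pass would report. */
static unsigned count_warnings_sketch(const enum rec_kind_sketch *recs,
                                      size_t n, bool skip_undo_phase)
{
  unsigned warnings= 0;
  bool saw_undo= false;
  size_t i;
  assert(recs != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    if (recs[i] == REC_INCOMPLETE_LOG_SKETCH)
      warnings++;          /* log misses REDO_INSERT* records */
    else if (recs[i] == REC_UNDO_SKETCH)
      saw_undo= true;
  }
  if (skip_undo_phase && saw_undo)
    warnings++;            /* tables may be left inconsistent */
  return warnings;
}
```

A caller such as maria_read_log would then use a non-zero count to temper its final "SUCCESS" message.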
uint warnings_count;
Fixed compiler warning message - Added checking of return value for system(), freopen(), fgets() and chown() - Ensure that calls that require a format strings gets a format string - Other trivial things Updated test suite results (especially for pbxt and embedded server) Removed warning for "Invalid (old?) table or database name 'mysqld.1'" from pbxt tests Speed up some pbxt tests by inserting begin ; commit; around "while loops with inserts" Added mysqld startup option '--debug-flush' Create maria_recovery.trace in data directory instead of current directory client/mysql.cc: Check return value from system() client/mysql_upgrade.c: Check return value from fgets() client/mysqladmin.cc: Check return value from fgets() client/mysqlslap.c: Check return value from system() (but ignore it, as it's not critical) extra/yassl/src/crypto_wrapper.cpp: Check return value from fgets() (but ignore it, as it's internal file) extra/yassl/taocrypt/src/aes.cpp: Added extra {} to remove compiler warning extra/yassl/taocrypt/src/blowfish.cpp: Added extra {} to remove compiler warning extra/yassl/taocrypt/src/misc.cpp: Ifdef not used code include/mysys_err.h: Added error message for failing chown() mysql-test/mysql-test-run.pl: Don't give warning for skipping ndbcluster (never enabled in MariaDB) mysql-test/suite/funcs_1/r/is_columns_is_embedded.result: Update with new information schema information mysql-test/suite/funcs_1/r/is_tables_is_embedded.result: New test mysql-test/suite/funcs_1/r/is_tables_myisam_embedded.result: Update test results (has not been tested for a long time) mysql-test/suite/funcs_1/r/is_tables_mysql_embedded.result: Update test results (has not been tested for a long time) mysql-test/suite/funcs_1/t/is_tables_is.test: Don't run with embedded server (as results differ) I added a new test for embedded server mysql-test/suite/funcs_1/t/is_tables_is_embedded.test: New test mysql-test/suite/pbxt/my.cnf: Allow one to run pbxt tests without having to specify 
--mysqld=--default-storage-engine=pbxt mysql-test/suite/pbxt/t/count_distinct3.test: Speed up test by inserting begin; ... commit; mysql-test/suite/pbxt/t/subselect.test: Speed up test by inserting begin; ... commit; mysys/errors.c: Added error message for failing chown() mysys/my_copy.c: Added error message for failing chown() mysys/my_redel.c: Added error message for failing chown() mysys/safemalloc.c: Added cast to get rid of compiler warning sql/ha_partition.cc: Fixed wrong argument to sql_print_error() (it requires a format string) sql/log.cc: Test return value of freopen() sql/mysqld.cc: Test return value of freopen() Added startup option '--debug-flush' to be used when one gets a core dump (easy to explain to people on IRC) sql/rpl_rli.cc: Fixed wrong argument to sql_print_error() (it requires a format string) sql/set_var.cc: Added {} to get rid of compiler warnings sql/slave.cc: Fixed wrong argument to mi->report() and sql_print...() (they require a format string) sql/sql_cache.cc: Fixed wrong argument to sql_printinformation() (it requires a format string) sql/sql_parse.cc: Test return value of fgets() sql/sql_plugin.cc: Fixed wrong argument to sql_print_error() (it requires a format string) sql/sql_select.cc: Use unique table name for internal temp tables instead of full path (Simple speed & space optimization) sql/udf_example.c: Removed compiler warning about not used variable storage/maria/ha_maria.cc: Fixed wrong argument to sql_print_error() and ma_check_print_error() (they require a format string) storage/maria/ma_recovery.c: Create maria_recovery.trace in data directory instead of current directory storage/maria/unittest/ma_test_loghandler-t.c: Fixed wrong argument to ok(); Requires a format string storage/pbxt/src/strutil_xt.cc: Detect temporary tables by checking if that path for the table is in the mysql data directory. 
The database for temporary tables is after this patch, from PBXT point of view, "" This is needed to stop PBXT from calling filename_to_tablename() with the base directory as an argument, which caused ERROR: Invalid (old?) table or database name 'mysqld.1'" in the log when running the test suite. tests/mysql_client_test.c: Fixed compiler warnings unittest/mysys/base64-t.c: Fixed wrong argument to diag() (it requires a format string) Added a comment that the current 'print' of differing buffers doesn't print the right thing, but didn't fix this as it's not important (unless we find a bug in the real code)
2009-10-26 12:35:42 +01:00
#ifdef EXTRA_DEBUG
char name_buff[FN_REFLEN];
#endif
2008-06-02 22:53:25 +02:00
DBUG_ENTER("maria_recovery_from_log");
DBUG_ASSERT(!maria_in_recovery);
maria_in_recovery= TRUE;
#ifdef EXTRA_DEBUG
fn_format(name_buff, "maria_recovery.trace", maria_data_root, "", MYF(0));
trace_file= my_fopen(name_buff, O_WRONLY|O_APPEND|O_CREAT, MYF(MY_WME));
#else
trace_file= NULL; /* no trace file for being fast */
#endif
tprint(trace_file, "TRACE of the last MARIA recovery from mysqld\n");
DBUG_ASSERT(maria_pagecache->inited);
res= maria_apply_log(LSN_IMPOSSIBLE, LSN_IMPOSSIBLE, MARIA_LOG_APPLY,
trace_file, TRUE, TRUE, TRUE, &warnings_count);
if (!res)
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is thus incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery neither of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
{
Fix for LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery More DBUG_PRINT (to simplify future debugging) Aria: Added STATE_IN_REPAIR, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Aria: Some trivial speedup optimization Aria: Better warning if table was marked crashed by unfinished repair mysql-test/lib/v1/mysql-test-run.pl: Fix so one can run RQG mysql-test/suite/maria/r/maria-recovery2.result: Update for new error message. mysys/stacktrace.c: Fixed compiler warning storage/maria/ha_maria.cc: More DBUG_PRINT Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Don't log query for dropping temporary table. storage/maria/ha_maria.h: Added prototype for drop_table() storage/maria/ma_blockrec.c: More DBUG_PRINT Make read_long_data() inline for most cases. (Trivial speedup optimization) storage/maria/ma_check.c: Better warning if table was marked crashed by unfinished repair storage/maria/ma_open.c: More DBUG_PRINT storage/maria/ma_recovery.c: Give warning if found crashed table. Changed warning for tables that can't be opened. storage/maria/ma_recovery_util.c: Write warnings to DBUG file storage/maria/maria_chk.c: Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. storage/maria/maria_def.h: Added maria_mark_in_repair(x) storage/maria/maria_read_log.c: Added option: --character-sets-dir storage/maria/trnman.c: By default set min_read_from to max value. This allows us to remove TRN:s from rows during recovery to get more space. This fixes bug LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
2010-07-30 09:45:27 +02:00
if (warnings_count == 0 && recovery_found_crashed_tables == 0)
tprint(trace_file, "SUCCESS\n");
else
tprint(trace_file, "DOUBTFUL (%u warnings, check previous output)\n",
warnings_count);
}
if (trace_file)
my_fclose(trace_file, MYF(0));
maria_in_recovery= FALSE;
DBUG_RETURN(res);
}
/**
@brief Displays and/or applies the log
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
@param from_lsn LSN from which log reading/applying should start;
LSN_IMPOSSIBLE means "use last checkpoint"
@param end_lsn Apply until this. LSN_IMPOSSIBLE means until end.
@param apply how log records should be applied or not
@param trace_file trace file where progress/debug messages will go
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
@param skip_DDLs_arg Should DDL records (CREATE/RENAME/DROP/REPAIR)
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
be skipped by the REDO phase or not
@param take_checkpoints Should we take checkpoints or not.
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
@param[out] warnings_count Count of warnings will be put there
2007-07-26 11:56:21 +02:00
@todo This trace_file thing is primitive; soon we will make it similar to
ma_check_print_warning() etc, and a successful recovery does not need to
create a trace file. But for debugging now it is useful.
@return Operation status
@retval 0 OK
@retval !=0 Error
*/
int maria_apply_log(LSN from_lsn, LSN end_lsn,
                    enum maria_apply_log_way apply,
                    FILE *trace_file,
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
                    my_bool should_run_undo_phase, my_bool skip_DDLs_arg,
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is thus incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery neither of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
                    my_bool take_checkpoints, uint *warnings_count)
{
  int error= 0;
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
  uint uncommitted_trans;
  ulonglong old_now;
  my_bool abort_message_printed= 0;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scracth). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
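The three-way group classification described in the commit message above (a record may not end a group, may end a group, or may be a group in itself and so abort any unfinished group) can be illustrated with a minimal sketch. All names here (`enum group_kind`, `struct group_state`, `feed_record`) are hypothetical and chosen for illustration only; the actual enum in ma_loghandler.h and the group handling in ma_recovery.c may well look different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a log record type (illustrative names). */
enum group_kind
{
  REC_NOT_GROUP_END,   /* e.g. a REDO: buffered until its group ends */
  REC_ENDS_GROUP,      /* e.g. an UNDO: closes the group of preceding REDOs */
  REC_IS_GROUP_ITSELF  /* e.g. a COMMIT: stands alone, aborts an open group */
};

/* Hypothetical counters standing in for the real per-group bookkeeping. */
struct group_state
{
  int pending;   /* records seen in the currently open group */
  int executed;  /* records that were executed */
  int aborted;   /* records dropped because their group never ended */
};

static void feed_record(struct group_state *st, enum group_kind kind)
{
  assert(st != NULL);
  switch (kind)
  {
  case REC_NOT_GROUP_END:
    st->pending++;                   /* wait for the record that ends the group */
    break;
  case REC_ENDS_GROUP:
    st->executed+= st->pending + 1;  /* run the whole group, including this record */
    st->pending= 0;
    break;
  case REC_IS_GROUP_ITSELF:
    st->aborted+= st->pending;       /* unfinished group: do not execute it */
    st->pending= 0;
    st->executed++;                  /* the stand-alone record itself is executed */
    break;
  }
}
```

Feeding two REDO-like records followed by a COMMIT-like record leaves two records counted as aborted and one as executed, which matches the "COMMIT is a group in itself, it aborts any previous group" rule quoted above.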
DBUG_ENTER("maria_apply_log");
DBUG_ASSERT(apply == MARIA_LOG_APPLY || !should_run_undo_phase);
DBUG_ASSERT(!maria_multi_threaded);
Fix for LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery More DBUG_PRINT (to simplify future debugging) Aria: Added STATE_IN_REPAIR, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Aria: Some trivial speedup optimization Aria: Better warning if table was marked crashed by unfinished repair mysql-test/lib/v1/mysql-test-run.pl: Fix so one can run RQG mysql-test/suite/maria/r/maria-recovery2.result: Update for new error message. mysys/stacktrace.c: Fixed compiler warning storage/maria/ha_maria.cc: More DBUG_PRINT Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Don't log query for dropping temporary table. storage/maria/ha_maria.h: Added prototype for drop_table() storage/maria/ma_blockrec.c: More DBUG_PRINT Make read_long_data() inline for most cases. (Trivial speedup optimization) storage/maria/ma_check.c: Better warning if table was marked crashed by unfinished repair storage/maria/ma_open.c: More DBUG_PRINT storage/maria/ma_recovery.c: Give warning if found crashed table. Changed warning for tables that can't be opened. storage/maria/ma_recovery_util.c: Write warnings to DBUG file storage/maria/maria_chk.c: Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. storage/maria/maria_def.h: Added maria_mark_in_repair(x) storage/maria/maria_read_log.c: Added option: --character-sets-dir storage/maria/trnman.c: By default set min_read_from to max value. This allows us to remove TRN:s from rows during recovery to get more space. This fixes bug LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
2010-07-30 09:45:27 +02:00
recovery_warnings= recovery_found_crashed_tables= 0;
Fixed some bugs in the Maria storage engine - Changed default recovery mode from OFF to NORMAL to get automatic repair of not properly closed tables. - Fixed a race condition when two threads call external_lock and thr_lock() in different order. When this happened the transaction that called external lock first and thr_lock() last did not see the rows from the other transaction, even if it had to wait in thr_lock() for the other to complete. - Fixed that one can run maria_chk on an automatically recovered table without warnings about too small transaction id - Don't give warning that crashed table could not be repaired if repair was disabled (and thus not run) - Fixed an error result from flush_key_cache() which caused a DBUG_ASSERT() when one was using concurrent reads on non transactional tables that were updated. client/mysqldump.c: Add "" around error message to make it more readable client/mysqltest.cc: Free environment variables mysql-test/r/mysqldump.result: Updated results mysql-test/r/openssl_1.result: Updated results mysql-test/suite/maria/r/maria-recover.result: Updated results mysql-test/suite/maria/r/maria3.result: Updated results mysql-test/suite/maria/t/maria3.test: Added more test of temporary tables storage/maria/ha_maria.cc: Changed default recovery mode from OFF to NORMAL to get automatic repair of not properly closed tables. Start transaction in ma_block_get_status() instead of in ha_maria::external_lock(). - This fixes a race condition when two threads call external lock and thr_lock() in different order. When this happened the transaction that called external lock first and thr_lock() last did not see the rows from the other transaction, even if it had to wait in thr_lock() for the other to complete. Store latest transaction id in control file if recovery was done. 
- This allows one to run maria_chk on an automatically recovered table without warnings about too small transaction id storage/maria/ha_maria.h: Don't give warning that crashed table could not be repaired if repair was disabled (and thus not run) storage/maria/ma_blockrec.h: Added new function "_ma_block_get_status_no_versioning()" storage/maria/ma_init.c: Added hook to create trn in ma_block_get_status() if we are using MariaDB storage/maria/ma_open.c: Ensure we call _ma_block_get_status_no_versioning() for transactional tables without versioning (like tables with fulltext) storage/maria/ma_pagecache.c: Allow one to flush blocks that are pinned for read. This fixed an error result from flush_key_cache() which caused a DBUG_ASSERT() when one was using concurrent reads on non transactional tables that were updated. storage/maria/ma_recovery.c: Set maria_recovery_changed_data to 1 if recovery changed something. Set max_trid_in_control_file to max found trn if we found a bigger trn. This ensures that the control file is up to date after recovery, which allows one to run maria_chk on the tables without warnings about too big trn storage/maria/ma_state.c: Call maria_create_trn_hook() in _ma_setup_live_state() instead of ha_maria::external_lock() This ensures that 'state' and trn are in sync and thus fixes the race condition mentioned for ha_maria.cc storage/maria/ma_static.c: Added maria_create_trn_hook() and maria_recovery_changed_data storage/maria/maria_def.h: Added MARIA_HANDLER->external_ptr, which is used to hold MariaDB thd. Added some new external variables Removed reference to non existing function: maria_concurrent_inserts()
2010-06-14 00:13:32 +02:00
maria_recovery_changed_data= 0;
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE. storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
/* checkpoints can happen only if TRNs have been built */
DBUG_ASSERT(should_run_undo_phase || !take_checkpoints);
DBUG_ASSERT(end_lsn == LSN_IMPOSSIBLE || should_run_undo_phase == 0);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash occurs between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but misses a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
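The ma_bitmap.c note in the commit message above explains why the first bitmap page is now extended to only 8190 bytes before its 2-byte end marker is written: a crash between the two steps leaves a page that is visibly too short, rather than a full-length page without a marker. A minimal sketch of the check this enables follows. The function and enum names are hypothetical, and the marker bytes are an assumption based on the "bm" signature mentioned in an earlier commit message; the real code in ma_bitmap.c may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BITMAP_PAGE_SIZE 8192  /* page size quoted in the commit message */

/* Assumed 2-byte end marker ("bm"); illustrative, not taken from the source. */
static const unsigned char bitmap_marker[2]= { 'b', 'm' };

enum first_page_state
{
  PAGE_VALID,      /* full page with a correct end marker */
  PAGE_TOO_SHORT,  /* crash before the marker was written: recreate the page */
  PAGE_CORRUPTED   /* full page but wrong marker: genuine corruption */
};

/* Classify the first bitmap page given the bytes actually present on disk. */
static enum first_page_state
check_first_bitmap_page(const unsigned char *buf, size_t len)
{
  assert(buf != NULL);
  if (len < BITMAP_PAGE_SIZE)
    return PAGE_TOO_SHORT;
  if (memcmp(buf + BITMAP_PAGE_SIZE - sizeof(bitmap_marker),
             bitmap_marker, sizeof(bitmap_marker)) != 0)
    return PAGE_CORRUPTED;
  return PAGE_VALID;
}
```

With the old 8192-byte extension, an interrupted creation would land in the corrupted case; with the 8190-byte extension it lands in the too-short case, which recovery can repair by recreating the page.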
all_active_trans= (struct st_trn_for_recovery *)
my_malloc((SHORT_TRID_MAX + 1) * sizeof(struct st_trn_for_recovery),
MYF(MY_ZEROFILL));
all_tables= (struct st_table_for_recovery *)
my_malloc((SHARE_ID_MAX + 1) * sizeof(struct st_table_for_recovery),
MYF(MY_ZEROFILL));
UNDO of rows now puts back all parts of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for not transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed if REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that were defined as BOOLEAN in dbug.c code Added DEBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as the test requires DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as the test requires DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as the test requires DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn off DBUG with --debug-on=0) Indentation fixes 
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for not transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplifications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all parts of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directory entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minimum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position Ensure allocation of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed some bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there are no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused page cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easily mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings.
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we were in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always include row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before blobs were always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that test recovery of update with blobs Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_maria_row->min_length' for storing min length needed on head page Added 'st_maria_handler->blob_buff & blob_buff_size' for
storing blobs Moved all tuning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before blobs were always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
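The START_EXTENT_BIT handling mentioned in this commit message can be illustrated with a minimal sketch. It assumes a 16-bit page-count field in which bit 0x4000 flags the first extent of a blob, so the count itself has to fit in the bits below that flag. The helper names, the mask name, and the 16-bit width are assumptions made for illustration and are not taken from ma_blockrec.h.

```c
#include <stdint.h>

/* Bit named in the commit message; marks the first extent of a blob. */
#define START_EXTENT_BIT 0x4000
/* Assumption: the page count occupies the bits below START_EXTENT_BIT. */
#define EXTENT_PAGE_COUNT_MASK 0x3fff /* hypothetical name */

/*
  Hypothetical helper: pack a page count and the start-of-extent flag.
  Returns 1 on success, 0 if the count does not fit below the flag bit.
*/
static int extent_encode(unsigned pages, int is_start, uint16_t *out)
{
  if (pages == 0 || pages > EXTENT_PAGE_COUNT_MASK)
    return 0;
  *out= (uint16_t) (pages | (is_start ? START_EXTENT_BIT : 0));
  return 1;
}

/* Hypothetical helper: page count with the flag bit ignored. */
static unsigned extent_page_count(uint16_t field)
{
  return field & EXTENT_PAGE_COUNT_MASK;
}

/* Hypothetical helper: non-zero if the field marks the start of an extent. */
static int extent_is_start(uint16_t field)
{
  return (field & START_EXTENT_BIT) != 0;
}
```

A reader of such a field that only needs the length, as the ma_check.c entry above describes, would call extent_page_count() and never look at the flag.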
2007-12-30 21:40:03 +01:00
save_error_handler_hook= error_handler_hook;
error_handler_hook= maria_recover_error_handler_hook;
2007-07-26 11:56:21 +02:00
if (!all_active_trans || !all_tables)
goto err;
WL#3071 - Maria checkpoint - serializing calls to flush_pagecache_blocks_int() on the same file to avoid known concurrency bugs - having that, we can now enable the background thread, as the flushes it does are now supposedly safe in concurrent situations. - new type of flush FLUSH_KEEP_LAZY: when the background checkpoint thread is flushing a packet of dirty pages between two checkpoints, it uses this flush type, indeed if a file is already being flushed by another thread it's smarter to move on to the next file than wait. - maria_checkpoint_frequency renamed to maria_checkpoint_interval. include/my_sys.h: new type of flushing for the page cache: FLUSH_KEEP_LAZY mysql-test/r/maria.result: result update mysys/mf_keycache.c: indentation. No FLUSH_KEEP_LAZY support in key cache. storage/maria/ha_maria.cc: maria_checkpoint_frequency was somehow a hidden part of the Checkpoint API and that was not good. Now we have checkpoint_interval, local to ha_maria.cc, which serves as container for the user-visible maria_checkpoint_interval global variable; setting it calls update_checkpoint_interval which passes the new value to ma_checkpoint_init(). There is no hiding anymore. By default, enable background thread which does checkpoints every 30 seconds, and dirty page flush in between. That thread takes a checkpoint when it ends, so no need for maria_hton_panic to take one. The | is | and not ||, because maria_panic() must always be called. frequency->interval. storage/maria/ma_checkpoint.c: Use FLUSH_KEEP_LAZY for background thread when it flushes packets of dirty pages between two checkpoints: it is smarter to move on to the next file than wait for it to have been completely flushed, which may take long. Comments about flush concurrency bugs moved from ma_pagecache.c. Removing out-of-date comment. frequency->interval. create_background_thread -> (interval>0). In ma_checkpoint_background(), some variables need to be preserved between iterations. 
storage/maria/ma_checkpoint.h: new prototype storage/maria/ma_pagecache.c: - concurrent calls of flush_pagecache_blocks_int() on the same file cause bugs (see @note in that function); we fix them by serializing in this situation. For that we use a global hash of (file, wqueue). When flush_pagecache_blocks_int() starts it looks into the hash, using the file as key. If not found, it inserts (file,wqueue) into the hash, flushes the file, and finally removes itself from the hash and wakes up any waiter in the queue. If found, it adds itself to the wqueue and waits. - As a by-product, we can remove changed_blocks_is_incomplete and replace it by scanning the hash, replace the sleep() by a queue wait. - new type of flush FLUSH_KEEP_LAZY: when flushing a file, if it's already being flushed by another thread (even partially), return immediately. storage/maria/ma_pagecache.h: In pagecache, a hash of files currently being flushed (i.e. there is a call to flush_pagecache_blocks_int() for them). storage/maria/ma_recovery.c: new prototype storage/maria/ma_test1.c: new prototype storage/maria/ma_test2.c: new prototype
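The serialization of flushes on one file, and the lazy variant that gives up instead of waiting, can be sketched as follows. This is a simplified model and not the ma_pagecache.c code: the commit message describes a hash of (file, wqueue), while this sketch swaps in a small fixed array and a single condition variable shared by all waiters. All names below are hypothetical, and FLUSH_MODE_LAZY only plays the role of FLUSH_KEEP_LAZY.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names; FLUSH_MODE_LAZY plays the role of FLUSH_KEEP_LAZY. */
enum flush_mode { FLUSH_MODE_WAIT, FLUSH_MODE_LAZY };

#define MAX_FLUSHING_FILES 16

static pthread_mutex_t flush_lock= PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t flush_finished= PTHREAD_COND_INITIALIZER;
static int flushing_files[MAX_FLUSHING_FILES]; /* files being flushed now */
static size_t flushing_count= 0;

/* Caller must hold flush_lock. */
static int file_is_being_flushed(int file)
{
  size_t i;
  for (i= 0; i < flushing_count; i++)
    if (flushing_files[i] == file)
      return 1;
  return 0;
}

/*
  Returns 1 when the caller is registered as the only flusher of 'file'
  and must later call flush_end(). Returns 0 when a lazy caller gives up
  because it could not register at once.
*/
static int flush_begin(int file, enum flush_mode mode)
{
  pthread_mutex_lock(&flush_lock);
  while (file_is_being_flushed(file) ||
         flushing_count == MAX_FLUSHING_FILES)
  {
    if (mode == FLUSH_MODE_LAZY)
    {
      pthread_mutex_unlock(&flush_lock);
      return 0;                         /* move on to the next file */
    }
    pthread_cond_wait(&flush_finished, &flush_lock);
  }
  flushing_files[flushing_count++]= file;
  pthread_mutex_unlock(&flush_lock);
  return 1;
}

/* Unregister 'file' and wake up every thread waiting in flush_begin(). */
static void flush_end(int file)
{
  size_t i;
  pthread_mutex_lock(&flush_lock);
  for (i= 0; i < flushing_count; i++)
  {
    if (flushing_files[i] == file)
    {
      flushing_files[i]= flushing_files[--flushing_count];
      break;
    }
  }
  pthread_cond_broadcast(&flush_finished);
  pthread_mutex_unlock(&flush_lock);
}
```

In this model a background thread would call flush_begin(file, FLUSH_MODE_LAZY) for each file in turn and skip any file for which it returns 0, while a checkpoint that needs the file fully flushed would use FLUSH_MODE_WAIT.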
2007-10-19 14:15:13 +02:00
if (take_checkpoints && ma_checkpoint_init(0))
WL#3071 - Maria checkpoint * Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing). * Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown. * fix for a test failure (after-merge fix) include/maria.h: new variable mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result: result update mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test: position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail) storage/maria/ha_maria.cc: Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shut down and restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind us that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init(). storage/maria/ma_checkpoint.c: Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset. storage/maria/ma_recovery.c: Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't do anything significant (didn't open any tables, didn't roll back any transactions).
With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
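The two rules stated in this commit message (skip the checkpoint when Recovery did nothing significant, and treat an interval of 0 as "no background checkpoints") can be sketched as below. The struct and function names are hypothetical and chosen for illustration; ma_recovery.c tracks this with its own flags such as checkpoint_useful.

```c
#include <stdbool.h>

/* Hypothetical summary of what a Recovery run actually did. */
struct recovery_summary
{
  unsigned tables_opened;
  unsigned transactions_rolled_back;
};

/*
  A checkpoint after Recovery is only worth its fsync()s if Recovery
  opened a table or rolled back a transaction.
*/
static bool recovery_checkpoint_useful(const struct recovery_summary *r)
{
  return r->tables_opened > 0 || r->transactions_rolled_back > 0;
}

/* An interval of 0 seconds means no background checkpoint thread. */
static bool background_checkpoints_enabled(unsigned interval_seconds)
{
  return interval_seconds > 0;
}
```

After a clean shutdown both counters stay at 0, so recovery_checkpoint_useful() returns false and startup avoids the extra log and control file syncs.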
2007-10-09 10:38:31 +02:00
goto err;
recovery_message_printed= REC_MSG_NONE;
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional.
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
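The REDO-phase skipping rules listed for ma_recovery.c in this commit message can be sketched as a single decision function. This is a minimal model under stated assumptions: an LSN is treated as an ordered 64-bit integer, the comparison is the strict "<skip_redo_lsn" written in the message, and the enum and function names are hypothetical rather than taken from the source file.

```c
#include <stdint.h>

typedef uint64_t lsn_t;   /* assumption: LSNs compare as plain integers */

/* Hypothetical states of the table a log record refers to. */
enum table_status { TABLE_USABLE, TABLE_MISSING, TABLE_CORRUPTED };

enum redo_action { REDO_APPLY, REDO_SKIP };

/*
  Decide what the REDO phase does with one record:
  - a missing or corrupted table is skipped instead of failing Recovery;
  - a record older than the table's skip_redo_lsn is skipped, because
    the table already reflects it (for example after a repair).
*/
static enum redo_action redo_phase_decide(enum table_status status,
                                          lsn_t record_lsn,
                                          lsn_t skip_redo_lsn)
{
  if (status != TABLE_USABLE)
    return REDO_SKIP;
  if (record_lsn < skip_redo_lsn)
    return REDO_SKIP;
  return REDO_APPLY;
}
```

Bumping skip_redo_lsn at the start of a repair, as described above, makes every older record fall into the second branch on the next Recovery run.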
2008-01-17 23:59:32 +01:00
checkpoint_useful= trns_created= FALSE;
2007-07-26 11:56:21 +02:00
tracef= trace_file;
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional.
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
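The rule stated in the commit message above ("Skip REDO|UNDO in REDO phase if <skip_redo_lsn") can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in ma_recovery.c: `lsn_sketch_t`, `struct table_state_sketch` and `redo_phase_should_skip()` are hypothetical names, an LSN is modelled as a plain 64-bit integer that grows over time, and a record whose LSN equals skip_redo_lsn is assumed to be applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: an LSN is modelled as a 64-bit integer that grows with time. */
typedef uint64_t lsn_sketch_t;

/* Hypothetical, reduced view of the per-table state kept in the index header. */
struct table_state_sketch
{
  lsn_sketch_t skip_redo_lsn;   /* records older than this are not replayed */
};

/*
  Returns true if a record met in the REDO phase should be skipped for this
  table, i.e. if it was logged strictly before the table's skip_redo_lsn.
*/
static bool redo_phase_should_skip(const struct table_state_sketch *state,
                                   lsn_sketch_t record_lsn)
{
  assert(state != NULL);
  return record_lsn < state->skip_redo_lsn;
}
```

With such a rule, bumping skip_redo_lsn at the start of a repair is enough to make a later recovery ignore every older record for that table.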
#ifdef INSTANT_FLUSH_OF_MESSAGES
/* enable this for instant flush of messages to trace file */
setbuf(tracef, NULL);
#endif
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
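The commit message above says that the count of warnings returned by maria_apply_log() tempers the final "SUCCESS" message. A minimal sketch of that decision is given below; `struct apply_outcome_sketch`, `enum final_status_sketch` and `final_status()` are hypothetical names introduced for illustration and do not reflect the real signature of maria_apply_log().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one log-applying run. */
struct apply_outcome_sketch
{
  int error;               /* non-zero if applying the log failed */
  unsigned int warnings;   /* e.g. UNDOs skipped, or an incomplete log seen */
};

enum final_status_sketch
{
  FINAL_FAILED,
  FINAL_SUCCESS_WITH_WARNINGS,   /* applied, but the user should read on */
  FINAL_SUCCESS
};

/* Choose the closing status: warnings downgrade a plain success. */
static enum final_status_sketch
final_status(const struct apply_outcome_sketch *outcome)
{
  assert(outcome != NULL);
  if (outcome->error)
    return FINAL_FAILED;
  if (outcome->warnings > 0)
    return FINAL_SUCCESS_WITH_WARNINGS;
  return FINAL_SUCCESS;
}
```

Returning a count rather than a boolean leaves the caller free to decide how loudly to report, which matches the two callers named in the message (maria_read_log and recovery).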
skip_DDLs= skip_DDLs_arg;
skipped_undo_phase= 0;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
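The ma_bitmap.c note in the commit message above explains why the first bitmap page is now sized to 8190 bytes before its 2-byte end marker is written: a crash between the two steps leaves a file that is visibly too short instead of a full-size page without a marker. The sketch below illustrates that idea with POSIX calls only. It is not the code in ma_bitmap.c: the function names, the enum and the "bm" marker bytes are assumptions made for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define BITMAP_PAGE_SIZE 8192   /* the "8k bitmap page" of the message */

/* Assumption: a 2-byte end marker; "bm" is used purely for illustration. */
static const unsigned char end_marker[2]= { 'b', 'm' };

enum first_page_state { PAGE_TOO_SHORT, PAGE_OK, PAGE_CORRUPTED };

/*
  Create the first bitmap page in two steps: size the file to 8190 bytes,
  then append the marker. A crash between the steps leaves a short file.
  Returns 0 on success, -1 on error.
*/
static int create_first_bitmap_page(const char *path)
{
  const off_t body_len= BITMAP_PAGE_SIZE - (off_t) sizeof(end_marker);
  int res= -1;
  int fd;
  assert(path != NULL);
  if ((fd= open(path, O_RDWR | O_CREAT, 0600)) < 0)
    return -1;
  if (ftruncate(fd, body_len) == 0 &&
      pwrite(fd, end_marker, sizeof(end_marker), body_len) ==
      (ssize_t) sizeof(end_marker))
    res= 0;
  if (close(fd) != 0)
    res= -1;
  return res;
}

/* What a later reader can conclude from the file alone. */
static enum first_page_state classify_first_bitmap_page(const char *path)
{
  unsigned char tail[sizeof(end_marker)];
  enum first_page_state state= PAGE_CORRUPTED;   /* also if unreadable */
  struct stat st;
  int fd;
  assert(path != NULL);
  if ((fd= open(path, O_RDONLY)) < 0)
    return PAGE_CORRUPTED;
  if (fstat(fd, &st) == 0)
  {
    if (st.st_size < BITMAP_PAGE_SIZE)
      state= PAGE_TOO_SHORT;                     /* safe to recreate */
    else if (pread(fd, tail, sizeof(tail),
                   BITMAP_PAGE_SIZE - (off_t) sizeof(tail)) ==
             (ssize_t) sizeof(tail) &&
             memcmp(tail, end_marker, sizeof(tail)) == 0)
      state= PAGE_OK;
  }
  close(fd);
  return state;
}
```

In this model the old ordering (extend to 8192 bytes, then overwrite the last 2 bytes) could be interrupted in the PAGE_CORRUPTED state, while the new ordering can only be interrupted in the PAGE_TOO_SHORT state, which a reader may repair by recreating the page.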
if (from_lsn == LSN_IMPOSSIBLE)
{
if (last_checkpoint_lsn == LSN_IMPOSSIBLE)
{
from_lsn= translog_first_lsn_in_log();
if (unlikely(from_lsn == LSN_ERROR))
goto err;
}
else
{
from_lsn= parse_checkpoint_record(last_checkpoint_lsn);
if (from_lsn == LSN_ERROR)
goto err;
}
}
now= my_getsystime();
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
in_redo_phase= TRUE;
When one does a drop table, the indexes are not flushed to disk before drop anymore (with MyISAM/Maria) myisam-recover options changed from OFF to 'DEFAULT' to get less chance of data loss when using MyISAM. (The disadvantage is that changed MyISAM tables will be checked at access time; Use --myisam-recover=OFF for old behavior) Don't call extra(HA_EXTRA_FORCE_REOPEN) in ALTER TABLE if table is locked as this will mark table as crashed! Added assert to detect if we accidentally would use MyISAM versioning in MySQL include/my_base.h: Mark NOT_USED as USED, as we now use this as a flag to not call extra() mysql-test/mysql-test-run.pl: Don't write all options when there is something wrong with the arguments mysql-test/r/sp-destruct.result: Add missing flush of mysql.proc (as the test copied live tables) mysql-test/r/variables.result: myisam-recover options changed to 'default' mysql-test/r/view.result: Don't show create time in result mysql-test/suite/maria/t/maria-recovery2-master.opt: Don't run test with myisam-recover (as this produces extra warnings during simulated death) mysql-test/t/sp-destruct.test: Add missing flush of mysql.proc (as the test copied live tables) mysql-test/t/view.test: Don't show create time in result sql/lock.cc: Added marker if table was deleted to argument list sql/mysql_priv.h: Added marker if table was deleted to argument list sql/mysqld.cc: myisam-recover options changed from OFF to 'DEFAULT' to get less chance of data loss when using MyISAM Allow one to specify OFF as argument to myisam-recover (was default before but one couldn't specify it) sql/sql_base.cc: Mark if table is going to be deleted sql/sql_delete.cc: Mark if table is going to be deleted sql/sql_table.cc: Mark if table is going to be deleted Don't call extra(HA_EXTRA_FORCE_REOPEN) in ALTER TABLE if table is locked as this will mark table as crashed! sql/table.cc: Signal to handler if table is getting deleted as part of getting dropped from table cache. 
sql/table.h: Added marker if table is going to be deleted. storage/maria/ha_maria.cc: Don't search for transaction handler if file is not transactional or outside of transaction (Fixed possible core dump) storage/maria/ma_blockrec.c: Don't write changed information if table is going to be deleted. storage/maria/ma_close.c: Don't write changed information if table is going to be deleted. storage/maria/ma_extra.c: Mark tables that are deleted as crashed, to ensure good behavior on restart if we suddenly crash. storage/maria/ma_locking.c: Cleanup storage/maria/ma_recovery.c: We need trnman to be inited during redo phase (to be able to open tables checked with maria_chk) storage/maria/maria_def.h: Added marker if table is going to be deleted. storage/myisam/mi_close.c: Don't write changed information if table is going to be deleted. storage/myisam/mi_extra.c: Mark tables that are deleted as crashed, to ensure good behavior on restart if we suddenly crash. storage/myisam/mi_open.c: Added assert to detect if we accidentally would use MyISAM versioning in MySQL storage/myisam/myisamdef.h: Added marker if table is going to be deleted.
2010-02-10 20:06:24 +01:00
trnman_init(max_trid_in_control_file);
if (run_redo_phase(from_lsn, end_lsn, apply))
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that caused reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional table with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional table with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd: 
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non transactional tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page content doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readability Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ERR_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages. 
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
{
ma_message_no_user(0, "Redo phase failed");
trnman_destroy();
2007-07-26 11:56:21 +02:00
goto err;
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that cause reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd: 
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non transactional tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page content doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readability Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ERR_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages.
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
}
When one does a drop table, the indexes are not flushed to disk before drop anymore (with MyISAM/Maria) myisam-recover options changed from OFF to 'DEFAULT' to get less chance of data loss when using MyISAM. (The disadvantage is that changed MyISAM tables will be checked at access time; Use --myisam-recover=OFF for old behavior) Don't call extra(HA_EXTRA_FORCE_REOPEN) in ALTER TABLE if table is locked as this will mark table as crashed! Added assert to detect if we accidentally would use MyISAM versioning in MySQL include/my_base.h: Mark NOT_USED as USED, as we now use this as a flag to not call extra() mysql-test/mysql-test-run.pl: Don't write all options when there is something wrong with the arguments mysql-test/r/sp-destruct.result: Add missing flush of mysql.proc (as the test copied live tables) mysql-test/r/variables.result: myisam-recover options changed to 'default' mysql-test/r/view.result: Don't show create time in result mysql-test/suite/maria/t/maria-recovery2-master.opt: Don't run test with myisam-recover (as this produces extra warnings during simulated death) mysql-test/t/sp-destruct.test: Add missing flush of mysql.proc (as the test copied live tables) mysql-test/t/view.test: Don't show create time in result sql/lock.cc: Added marker if table was deleted to argument list sql/mysql_priv.h: Added marker if table was deleted to argument list sql/mysqld.cc: myisam-recover options changed from OFF to 'DEFAULT' to get less chance of data loss when using MyISAM Allow one to specify OFF as argument to myisam-recover (was default before but one couldn't specify it) sql/sql_base.cc: Mark if table is going to be deleted sql/sql_delete.cc: Mark if table is going to be deleted sql/sql_table.cc: Mark if table is going to be deleted Don't call extra(HA_EXTRA_FORCE_REOPEN) in ALTER TABLE if table is locked as this will mark table as crashed! sql/table.cc: Signal to handler if table is getting deleted as part of getting dropped from table cache.
sql/table.h: Added marker if table is going to be deleted. storage/maria/ha_maria.cc: Don't search for transaction handler if file is not transactional or outside of transaction (Fixed possible core dump) storage/maria/ma_blockrec.c: Don't write changed information if table is going to be deleted. storage/maria/ma_close.c: Don't write changed information if table is going to be deleted. storage/maria/ma_extra.c: Mark tables that are deleted as crashed, to ensure good behavior on restart if we suddenly crash. storage/maria/ma_locking.c: Cleanup storage/maria/ma_recovery.c: We need trnman to be inited during redo phase (to be able to open tables checked with maria_chk) storage/maria/maria_def.h: Added marker if table is going to be deleted. storage/myisam/mi_close.c: Don't write changed information if table is going to be deleted. storage/myisam/mi_extra.c: Mark tables that are deleted as crashed, to ensure good behavior on restart if we suddenly crash. storage/myisam/mi_open.c: Added assert to detect if we accidentally would use MyISAM versioning in MySQL storage/myisam/myisamdef.h: Added marker if table is going to be deleted.
2010-02-10 20:06:24 +01:00
trnman_destroy();
2007-07-26 11:56:21 +02:00
if (end_lsn != LSN_IMPOSSIBLE)
{
abort_message_printed= 1;
my_message(HA_ERR_INITIALIZATION,
"Maria recovery aborted as end_lsn/end of file was reached",
MYF(0));
goto err2;
}
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as there is a LOGREC_INCOMPLETE_GROUP to draw a boundary).
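The boundary rule described in this commit message can be illustrated with a minimal sketch. It is not the code in ma_recovery.c: the enum `rec_kind` and the function `count_executed_redos()` are hypothetical names invented for the illustration, only one transaction is modelled, and the only assumption taken from the message is that a group of REDOs is executed once an UNDO or CLR_END closes it, while an incomplete-group marker discards whatever is still open.

```c
#include <assert.h>

/* Hypothetical record kinds; the real log uses LOGREC_* record types. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Count the REDOs that one recovery pass would execute for a single
  transaction. REDOs are buffered until a group-ending record (UNDO or
  CLR_END) arrives; an INCOMPLETE_GROUP marker drops the buffered REDOs.
*/
static int count_executed_redos(const enum rec_kind *log, int len)
{
  int pending= 0, executed= 0, i;
  assert(len >= 0);
  for (i= 0; i < len; i++)
  {
    switch (log[i]) {
    case REC_REDO:
      pending++;                 /* part of a group that is still open */
      break;
    case REC_UNDO:
    case REC_CLR_END:
      executed+= pending;        /* group is complete: run its REDOs */
      pending= 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending= 0;                /* boundary: forget the unfinished group */
      break;
    }
  }
  return executed;               /* REDOs still pending at end are skipped */
}
```

Under these assumptions, the crashed log REDO - UNDO - REDO* yields 1 executed REDO. After a rollback that appends REDO - CLR_END without a marker, a second pass yields 3, because REDO* is wrongly paired with the CLR_END. With the marker written right after REDO*, the second pass yields 2 and REDO* stays skipped.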
2007-12-10 23:26:53 +01:00
if ((uncommitted_trans=
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE. storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
end_of_redo_phase(should_run_undo_phase)) == (uint)-1)
2007-12-18 02:21:32 +01:00
{
ma_message_no_user(0, "End of redo phase failed");
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
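The bitmap-page fix described in the commit message above can be sketched roughly as follows. This is only an illustration using plain POSIX calls, not the actual ma_bitmap.c code: `create_first_bitmap_page()`, `bitmap_page_is_complete()`, the two macros and the marker bytes passed in are all hypothetical names chosen for the example. The idea is that the file is first extended to the page size minus the marker, so a crash before the marker is written leaves a page that is visibly too short instead of a full-size page without its end marker.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BITMAP_PAGE_SIZE   8192  /* page size quoted in the commit message */
#define BITMAP_MARKER_SIZE 2     /* the two end-marker bytes of the page */

/*
  Hypothetical helper (assumption, not the real Maria function):
  create the first bitmap page in two steps.
*/
static int create_first_bitmap_page(int fd,
                                    const unsigned char marker[BITMAP_MARKER_SIZE])
{
  assert(fd >= 0);
  /* Step 1: extend the file to 8190 bytes; a crash here leaves a short page. */
  if (ftruncate(fd, BITMAP_PAGE_SIZE - BITMAP_MARKER_SIZE) != 0)
    return -1;
  /* Step 2: append the marker, which brings the page to its full size. */
  if (pwrite(fd, marker, BITMAP_MARKER_SIZE,
             BITMAP_PAGE_SIZE - BITMAP_MARKER_SIZE) !=
      (ssize_t) BITMAP_MARKER_SIZE)
    return -1;
  return 0;
}

/*
  Hypothetical check a recovery pass could make: a page shorter than
  BITMAP_PAGE_SIZE was interrupted during creation and can be recreated.
*/
static int bitmap_page_is_complete(int fd)
{
  off_t length= lseek(fd, 0, SEEK_END);
  return length >= (off_t) BITMAP_PAGE_SIZE;
}
```

With this ordering, a file length below 8192 bytes is enough to tell that the page must be recreated, without reading and validating its content.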
goto err;
Fixed several bugs in page CRC handling
- Ignore CRC errors in REDO for potential new pages
- Ignore CRC errors when repairing tables
- Don't do readcheck callback on read error
- Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC
- Check index page for length before calculating CRC to catch bad pages
Fixed bugs where we used wrong file descriptor to read/write bitmaps
Fixed wrong hash key in 'files_in_flush'
Fixed wrong lock method when writing bitmap
Fixed some wrong printf statements in check/repair that caused core dumps
Fixed argument to translog_page_validator that caused reading of log files to fail
Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages.
Use fast 'dummy' pagecheck callbacks for temporary tables
Don't die silently if flush finds pinned pages
Give error (for now) if one tries to create a transactional table with fulltext or spatial keys
Removed some not needed calls to pagecache_file_init()
Added checking of pagecache checksums to ma_test1 and ma_test2
More DBUG
Fixed some DBUG_PRINT to be in line with rest of the code

include/my_base.h:
  Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC
mysql-test/r/binlog_unsafe.result:
  Added missing DROP VIEW statement
mysql-test/r/maria.result:
  Added TRANSACTIONAL=0 when testing with fulltext keys
  Added test that verifies we can't yet create transactional table with fulltext or spatial keys
mysql-test/r/ps_maria.result:
  Added TRANSACTIONAL=0 when testing with fulltext keys
mysql-test/t/binlog_unsafe.test:
  Added missing DROP VIEW statement
mysql-test/t/maria.test:
  Added TRANSACTIONAL=0 when testing with fulltext keys
  Added test that verifies we can't yet create transactional table with fulltext or spatial keys
mysql-test/t/ps_maria.test:
  Added TRANSACTIONAL=0 when testing with fulltext keys
mysys/my_fopen.c:
  Fd: -> fd:
mysys/my_handler.c:
  Added new error messages
mysys/my_lock.c:
  Fd: -> fd:
mysys/my_pread.c:
  Fd: -> fd:
mysys/my_read.c:
  Fd: -> fd:
mysys/my_seek.c:
  Fd: -> fd:
mysys/my_sync.c:
  Fd: -> fd:
mysys/my_write.c:
  Fd: -> fd:
sql/mysqld.cc:
  Fixed wrong argument to my_uuid_init()
sql/sql_plugin.cc:
  Unified DBUG_PRINT (for convert-dbug-for-diff)
storage/maria/ma_bitmap.c:
  Fixed wrong lock method when writing bitmap
  Fixed valgrind error
  Use fast 'dummy' pagecheck callbacks for temporary tables
  Faster bitmap handling for non-transactional tables
storage/maria/ma_blockrec.c:
  Fixed that bitmap reading is done with the correct filehandle
  Handle reading of pages with wrong CRC when page content doesn't matter
  Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs)
storage/maria/ma_check.c:
  Split long strings for readability
  Fixed some wrong printf statements that caused core dumps
  Use bitmap.file for bitmaps
  Ignore pages with wrong CRC
storage/maria/ma_close.c:
  More DBUG_PRINT
storage/maria/ma_create.c:
  Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys
storage/maria/ma_key_recover.c:
  Ignore HA_ERR_WRONG_CRC for new pages
  info->s -> share
  Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages.
storage/maria/ma_loghandler.c:
  Fixed argument to translog_page_validator()
storage/maria/ma_open.c:
  Removed old VMS specific code
  Added function to setup pagecache callbacks
  Moved code around to set 'share->temporary' early
  Removed some not needed calls to pagecache_file_init()
storage/maria/ma_page.c:
  Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages.
storage/maria/ma_pagecache.c:
  Don't do readcheck callback on read error
  Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page
  Set my_errno to HA_ERR_INTERNAL_ERROR if flush() finds pinned pages
  Don't die silently if flush finds pinned pages.
  Use correct file descriptor when flushing pages
  Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor
  More DBUG_PRINT
storage/maria/ma_pagecrc.c:
  Removed inline from not tiny static function
  Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused)
  CRCerror -> error (to keep code uniform)
  Print crc with %lu, as in my_checksum()
  uchar* -> uchar *
  Check index page for length before calculating CRC to catch bad pages
  Added 'dummy' crc_check and filler functions that are used for temporary tables
storage/maria/ma_recovery.c:
  More DBUG
  More messages to users to give information about what phase failed
  Better error message if recovery failed
storage/maria/ma_test1.c:
  Added checking of page checksums (combined with 'c' to not have to add more test runs)
storage/maria/ma_test2.c:
  Added checking of page checksums (combined with 'c' to not have to add more test runs)
storage/maria/maria_chk.c:
  Fixed wrong argument to _ma_check_print_error()
storage/maria/maria_def.h:
  Added format information to _ma_check_print_xxxx functions
  uchar* -> uchar *
2007-12-18 02:21:32 +01:00
}
WL#3072 - Maria Recovery
Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash).
Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end.
Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase.
Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes().
Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this.

include/myisamchk.h:
  new flag: bulk insert repair mustn't bump create_rename_lsn
mysql-test/lib/mtr_report.pl:
  skip normal warning in maria-recovery.test
mysql-test/r/maria-recovery.result:
  result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing.
mysql-test/t/maria-recovery.test:
  - can't check the table or it would commit the transaction, but check is made after recovery.
  - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index).
storage/maria/CMakeLists.txt:
  new file
storage/maria/Makefile.am:
  new file
storage/maria/ha_maria.cc:
  - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert().
  - write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record).
  - Adding back file->trn=NULL which was removed by mistake earlier.
storage/maria/ha_maria.h:
  new member (see ha_maria.cc)
storage/maria/ma_bitmap.c:
  Functions to create missing bitmaps:
  - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon.
  - one function which the one above calls, and creates bitmaps in page cache
  - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above.
storage/maria/ma_blockrec.c:
  - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex.
  - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list
  - execution of UNDO_BULK_INSERT_WITH_REPAIR
storage/maria/ma_blockrec.h:
  new functions
storage/maria/ma_check.c:
  - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added
  - maria_repair() is allowed to re-enable logging only if it is the one which disabled it.
  - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record)
  - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair())
  - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end.
  - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair().
storage/maria/ma_create.c:
  Update skip_redo_lsn, create_rename_lsn optionally.
storage/maria/ma_delete_all.c:
  Need to sync files at end of maria_delete_all_rows(), if transactional.
storage/maria/ma_extra.c:
  During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE.
storage/maria/ma_key_recover.c:
  - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices.
  - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure
storage/maria/ma_key_recover.h:
  new prototype
storage/maria/ma_locking.c:
  DBUG_VOID_RETURN missing
storage/maria/ma_loghandler.c:
  UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type.
storage/maria/ma_loghandler.h:
  new UNDO and REDO
storage/maria/ma_open.c:
  Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes.
storage/maria/ma_recovery.c:
  - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings.
  - Skip REDO|UNDO in REDO phase if <skip_redo_lsn.
  - If UNDO phase fails, delete transactions to not make trnman assert.
  - Update skip_redo_lsn when playing REDO_CREATE_TABLE
  - Don't record UNDOs for old transactions which we don't know (long_trid==0)
  - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c)
  - Execution of UNDO_BULK_INSERT_WITH_REPAIR
  - Don't try to find a page number in REDO_DELETE_ALL
  - Pieces moved to ma_recovery_util.c
storage/maria/ma_rename.c:
  name change
storage/maria/ma_static.c:
  I modified layout of the index' header (inserted skip_redo_lsn in its middle)
storage/maria/ma_test2.c:
  allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT
storage/maria/ma_test_recovery.expected:
  6 as testflag instead of 4
storage/maria/ma_test_recovery:
  Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug.
storage/maria/maria_chk.c:
  skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk().
storage/maria/maria_def.h:
  New member skip_redo_lsn in the state, and comments
storage/maria/maria_pack.c:
  skip_redo_lsn should be updated too, for consistency
storage/maria/ma_recovery_util.c:
  _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c).
storage/maria/ma_recovery_util.h:
  new file
2008-01-17 23:59:32 +01:00
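The skipping rules listed for ma_recovery.c in the commit message above (skip an unusable table, skip records older than skip_redo_lsn) could be pictured with a small decision function. This is a minimal sketch under stated assumptions: `demo_lsn`, `struct demo_table`, `enum demo_action` and `redo_phase_action()` are hypothetical names, LSNs are modelled as plain 64-bit integers, and the strict `<` comparison simply follows the wording of the commit message rather than the real code.

```c
#include <stdint.h>

/* Assumption: an LSN is modelled as a monotonically growing 64-bit number. */
typedef uint64_t demo_lsn;

/* Hypothetical, reduced view of what recovery knows about a table. */
struct demo_table
{
  int      usable;         /* 0 if missing, corrupted or repaired by maria_chk */
  demo_lsn skip_redo_lsn;  /* records older than this must not be applied */
};

enum demo_action
{
  DEMO_APPLY,              /* execute the record */
  DEMO_SKIP_OLD_RECORD,    /* record predates skip_redo_lsn */
  DEMO_SKIP_TABLE          /* table cannot be recovered, go to next record */
};

/* Decide what the REDO phase does with one record for one table. */
static enum demo_action redo_phase_action(const struct demo_table *table,
                                          demo_lsn record_lsn)
{
  if (!table->usable)
    return DEMO_SKIP_TABLE;           /* do not fail, just move on */
  if (record_lsn < table->skip_redo_lsn)
    return DEMO_SKIP_OLD_RECORD;      /* e.g. table was repaired later */
  return DEMO_APPLY;
}
```

The point of returning a "skip" action instead of an error is the robustness goal stated in the message: one bad table should not make the whole recovery fail.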
in_redo_phase= FALSE;
WL#3072 Maria recovery
Misc changes:
- fix for benign Valgrind error, compiler warnings
- fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints
- fix for too paranoid assertion
- adding ability to take checkpoints at the end of the REDO phase and at the end of recovery.
- other minor changes

storage/maria/ha_maria.cc:
  The checkpoint done after Recovery is finished, is moved to maria_recover().
storage/maria/ma_bitmap.c:
  fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized
storage/maria/ma_checkpoint.c:
  * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc().
  * fix for compiler warnings.
storage/maria/ma_delete_all.c:
  info->trn is NULL for non-transactional tables
storage/maria/ma_locking.c:
  correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE).
storage/maria/ma_loghandler.c:
  fail early if UNRECOVERABLE_ERROR
storage/maria/ma_recovery.c:
  * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not.
  * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details").
  * Refining an error detection for something which could happen if there is a checkpoint record in the log.
  * Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any).
storage/maria/ma_recovery.h:
  new parameter
storage/maria/maria_read_log.c:
  update to new prototype
2007-10-08 19:08:25 +02:00
old_now= now;
now= my_getsystime();
if (recovery_message_printed == REC_MSG_REDO)
{
Added --with-maria-tmp-tables (default on) to allow one to configure if Maria should be used for internal temporary tables
Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables
Fixed bug that caused update of big blobs to crash
Use pagecache_page_no_t as type for pages (to get rid of compiler warnings)
Added cast to get rid of compiler warning
Fixed wrong types of variables and arguments that caused lost information
Fixed wrong DBUG_ASSERT() that caused REDO of big blobs to fail
Removed some historical ifdefs that caused problems with Windows compilations

BUILD/SETUP.sh:
  Added --with-maria-tmp-tables
include/maria.h:
  Use pagecache_page_no_t as type for pages
  Use my_bool as parameter for 'rep_quick' option
include/my_base.h:
  Added comment
mysql-test/r/maria-big.result:
  Added test that uses big blobs
mysql-test/t/maria-big.test:
  Added test that uses big blobs
sql/mysqld.cc:
  Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables
sql/sql_class.h:
  Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined
sql/sql_select.cc:
  Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined
storage/maria/ha_maria.cc:
  Fixed compiler warnings reported by MCC
  - Fixed usage of wrong types that caused data loss
  - Changed parameter for rep_quick to my_bool
  - Added safe casts
  Fixed indentation
storage/maria/ma_bitmap.c:
  Use pagecache_page_no_t as type for pages
  Fixed compiler warnings
  Fixed bug that caused update of big blobs to crash
storage/maria/ma_blockrec.c:
  Use pagecache_page_no_t as type for pages
  Use my_bool as parameter for 'rep_quick' option
  Fixed compiler warnings
  Fixed wrong DBUG_ASSERT()
storage/maria/ma_blockrec.h:
  Use pagecache_page_no_t as type for pages
storage/maria/ma_check.c:
  Fixed some wrong parameters where we didn't get all bits for test_flag
  Changed rep_quick to be of type my_bool
  Use pagecache_page_no_t as type for pages
  Added casts to get rid of compiler warnings
  Changed type of record_pos to get rid of compiler warning
storage/maria/ma_create.c:
  Added safe casts to get rid of compiler warnings
storage/maria/ma_dynrec.c:
  Fixed usage of wrong type
storage/maria/ma_key.c:
  Fixed compiler warning
storage/maria/ma_key_recover.c:
  Use pagecache_page_no_t as type for pages
storage/maria/ma_loghandler_lsn.h:
  Added casts to get rid of compiler warnings
storage/maria/ma_page.c:
  Changed variable name from 'page' to 'pos' as it was an offset and not a page address
  Moved page_size inside block to get rid of compiler warning
storage/maria/ma_pagecache.c:
  Fixed compiler warnings
  Replaced compile time assert with TODO
storage/maria/ma_pagecache.h:
  Use pagecache_page_no_t as type for pages
storage/maria/ma_pagecrc.c:
  Allow bitmap pages that are all zero
storage/maria/ma_preload.c:
  Added cast to get rid of compiler warning
storage/maria/ma_recovery.c:
  Changed types to get rid of compiler warnings
  Use bool for quick_repair to get rid of compiler warning
  Fixed some variables that were wrongly declared (not enough precision)
  Added cast to get rid of compiler warning
storage/maria/ma_test2.c:
  Remove historical undefs
storage/maria/maria_chk.c:
  Changed rep_quick to bool
  Fixed wrong parameter to maria_chk_data_link()
storage/maria/maria_def.h:
  Use pagecache_page_no_t as type for pages
storage/maria/maria_pack.c:
  Renamed isam -> maria
storage/maria/plug.in:
  Added option --with-maria-tmp-tables
storage/maria/trnman.c:
  Added cast to get rid of compiler warning
storage/myisam/mi_test2.c:
  Remove historical undefs
2008-01-10 20:21:36 +01:00
double phase_took= (now - old_now)/10000000.0;
WL#3071 Maria checkpoint, WL#3072 Maria recovery
instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]).
Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line.
sql_print_error() changed to use my_progname_short (nicer display).
mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r").

include/my_sys.h:
  new flags to tell error_handler_hook that this is not an error but an information or warning
mysql-test/mysql-test-run.pl:
  when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r")
mysys/my_init.c:
  set my_progname_short
mysys/my_static.c:
  my_progname_short added
sql/mysqld.cc:
  * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria.
  * plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs())
  * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen)
storage/maria/ma_checkpoint.c:
  fprintf(stderr) -> ma_message_no_user()
storage/maria/ma_checkpoint.h:
  function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr).
storage/maria/ma_recovery.c:
  To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line.
  071116 18:42:16 [Note] mysqld: Maria engine: starting recovery
  recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds);
  071116 18:42:16 [Note] mysqld: Maria engine: recovery done
storage/maria/maria_chk.c:
  my_progname_short moved to mysys
storage/maria/maria_read_log.c:
  my_progname_short moved to mysys
storage/myisam/myisamchk.c:
  my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
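The message routing described in the commit message above (flags on the message decide whether it is reported as [Note], [Warning] or [ERROR]) might look roughly like the sketch below. It is only an illustration: the `DEMO_` flag values, `demo_log_level()` and `demo_format_message()` are hypothetical stand-ins, and the real ME_JUST_INFO / ME_JUST_WARNING definitions and my_message_sql() logic are not reproduced here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flag values, standing in for ME_JUST_INFO / ME_JUST_WARNING. */
#define DEMO_ME_JUST_INFO    1u
#define DEMO_ME_JUST_WARNING 2u

/* Pick the severity tag for a message from its flags; default is an error. */
static const char *demo_log_level(unsigned flags)
{
  if (flags & DEMO_ME_JUST_INFO)
    return "Note";
  if (flags & DEMO_ME_JUST_WARNING)
    return "Warning";
  return "ERROR";
}

/*
  Format one log line in the "[Level] progname: message" shape seen in the
  sample output quoted in the commit message (timestamp omitted).
*/
static int demo_format_message(char *buf, size_t size, const char *progname,
                               const char *message, unsigned flags)
{
  return snprintf(buf, size, "[%s] %s: %s",
                  demo_log_level(flags), progname, message);
}
```

A caller reporting the start of recovery would pass the "info" flag, while a failure would pass no flag and so be tagged as an error.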
/*
Detailed progress info goes to stderr, because ma_message_no_user()
cannot put several messages on one line.
*/
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break)
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel)
REDO_REPAIR now also logs used key map
Fixed some bugs in REDO logging of key pages
Better error messages from maria_read_log
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code.
Don't allow pagecaches with less than 8 blocks (Causes strange crashes)
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise)
Fixed bug in ma_pagecache unit tests that caused program to sometimes fail
Added some missing calls to MY_INIT() that caused some unit tests to fail
Fixed that TRUNCATE works properly on temporary MyISAM files
Updated some result files to new table checksum results (checksum when NULL fields are ignored)
perl test-insert can be replayed with maria_read_log!

sql/share/Makefile.am:
  Change mode to -rw-rw-r--
BitKeeper/etc/ignore:
  added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log
include/maria.h:
  Added maria_tmpdir
include/my_base.h:
  Added error HA_ERR_FILE_TOO_SHORT
include/my_sys.h:
  Added variable my_crc_dbug_check
  Added function my_dbug_put_break_here()
include/myisamchk.h:
  Added org_key_map (Needed for writing REDO record for REPAIR)
mysql-test/r/innodb.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/mix2_myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/t/myisam.test:
  Added used table
mysys/checksum.c:
  Added DBUG for checksum results
  Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC
mysys/lf_alloc-pin.c:
  Fixed compiler warning
mysys/my_handler.c:
  Added new error message
mysys/my_init.c:
  If my_progname is not given, use 'unknown' for my_progname_short
  Added debugger function my_debug_put_break_here()
mysys/my_pread.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
mysys/my_read.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
sql/mysqld.cc:
  Added debug option --debug-crc-break
sql/sql_parse.cc:
  Trivial optimization
storage/maria/ha_maria.cc:
  Renamed variable to be more logical
  Ensure that param.testflag is correct when calling repair
  Added extra argument to init_pagecache
  Set default value for maria_tempdir
storage/maria/ma_blockrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_check.c:
  Set param->testflag to match how repair is run (needed for REDO logging)
  Simple optimization
  Moved flag if page is node from pagelength to keypage-flag byte
  Log used key map in REDO log.
storage/maria/ma_delete.c:
  Remember previous UNDO entry when writing undo (for future CLR records)
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed some bugs in redo logging
  Added CRC for some translog REDO_INDEX entries
storage/maria/ma_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_ft_update.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_key_recover.c:
  Added CRC for some translog REDO_INDEX entries
  Removed not needed pagecache_write() in _ma_apply_redo_index()
storage/maria/ma_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_loghandler.c:
  Added used key map to REDO_REPAIR_TABLE
storage/maria/ma_loghandler.h:
  Added operation for checksum of key pages
storage/maria/ma_open.c:
  Allocate storage for undo lsn pointers
storage/maria/ma_pagecache.c:
  Remove not needed include file
  Change logging to use fd: for file descriptors as other code
  Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log
  Don't allow pagecaches with less than 8 blocks
  Remove wrong DBUG_ASSERT()
storage/maria/ma_pagecache.h:
  Added readwrite_flags
storage/maria/ma_recovery.c:
  Better error messages for maria_read_log:
  - Added eprint() for printing error messages
  - Print extra \n before error message if we are printing %0 %10 ...
  Added used key_map to REDO_REPAIR log entry
  More DBUG
  Call same repair method that was used by mysqld
storage/maria/ma_rt_index.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_rt_key.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_rt_split.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_static.c:
  Added maria_tmpdir
storage/maria/ma_test1.c:
  Updated call to init_pagecache()
storage/maria/ma_test2.c:
  Updated call to init_pagecache()
storage/maria/ma_test3.c:
  Updated call to init_pagecache()
storage/maria/ma_write.c:
  Removed #ifdef NOT_YET
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed bug in _ma_log_del_prefix()
storage/maria/maria_chk.c:
  Fixed wrong min limit for page_buffer_size
  Updated call to init_pagecache()
storage/maria/maria_def.h:
  Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/maria_ftdump.c:
  Updated call to init_pagecache()
storage/maria/maria_pack.c:
  Updated call to init_pagecache()
  Reset share->state.create_rename_lsn & share->state.is_of_horizon
storage/maria/maria_read_log.c:
  Better error messages
  Added --tmpdir option (needed to set temporary directory for REDO_REPAIR)
  Added --start-from-lsn
  Changed option for --display-only to 'd' (wanted to use -o for 'offset')
storage/maria/unittest/lockman2-t.c:
  Added missing call to MY_INIT()
storage/maria/unittest/ma_pagecache_consist.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_pagecache_single.c:
  Fixed bug that caused program to sometimes fail
  Added some DBUG_ASSERTS()
  Changed some calls to malloc()/free() to my_malloc()/my_free()
  Create extra file to expose original hard-to-find bug
storage/maria/unittest/ma_test_loghandler-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/test_file.c:
  Changed malloc()/free() to my_malloc()/my_free()
  Fixed memory leak
  Changed logic a bit while trying to find bug in reset_file()
storage/maria/unittest/trnman-t.c:
  Added missing call to MY_INIT()
storage/myisam/mi_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_create.c:
  Removed O_EXCL to get TRUNCATE to work for temporary files
storage/myisam/mi_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
mysql-test/r/old-mode.result:
  New BitKeeper file ``mysql-test/r/old-mode.result''
mysql-test/t/old-mode-master.opt:
  New BitKeeper file ``mysql-test/t/old-mode-master.opt''
mysql-test/t/old-mode.test:
  New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
procent_printed= 1;
fprintf(stderr, " (%.1f seconds); ", phase_took);
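The timing lines shown above (`old_now`, `now`, `phase_took`) divide a tick difference by 10000000.0, which suggests that the clock counts in 100-nanosecond units, so ten million ticks make one second. A minimal stand-alone sketch of that arithmetic follows; `demo_ticks`, `demo_phase_seconds()` and `demo_format_phase_took()` are hypothetical names introduced for the example and are not part of ma_recovery.c.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: timestamps are counted in 100 ns ticks. */
typedef unsigned long long demo_ticks;

#define DEMO_TICKS_PER_SECOND 10000000.0

/* Convert the difference between two tick counts into seconds. */
static double demo_phase_seconds(demo_ticks old_now, demo_ticks now)
{
  return (double) (now - old_now) / DEMO_TICKS_PER_SECOND;
}

/*
  Format the duration the same way as the fprintf() above, but into a
  buffer so that the result can be inspected.
*/
static int demo_format_phase_took(char *buf, size_t size, double phase_took)
{
  return snprintf(buf, size, " (%.1f seconds); ", phase_took);
}
```

For instance, a difference of 25000000 ticks corresponds to 2.5 seconds and is rendered as " (2.5 seconds); ", which keeps each phase's duration on the same progress line.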
2008-01-17 23:59:32 +01:00
fflush(stderr);
}
/**
REDO phase does not fill blocks' rec_lsn, so a checkpoint now would be
wrong: if a future recovery used it, the REDO phase would always
start from the checkpoint and never from before, wrongly skipping REDOs
WL#3072 - Maria recovery. * fix for bitmap vs checkpoint bug which could lead to corrupted tables in case of crashes at certain moments: a bitmap could be flushed to disk even though it was inconsistent with the log (it could be flushed before REDO-UNDO are written to the log). One bug remains, need code from others. Tests added. Fix is to pin unflushable bitmap pages, and let checkpoint wait for them to be flushable. * fix for long_trid!=0 assertion failure at Recovery. * less useless wakeups in the background flush|checkpoint thread. * store global_trid_generator in checkpoint record. mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery.test: make it easier to locate subtests storage/maria/ma_bitmap.c: When we send a bitmap to the pagecache, if this bitmap is not in a flushable state we keep it pinned and add it to a list, it will be unpinned when the bitmap is flushable again. A new function _ma_bitmap_flush_all() used by checkpoint. A new function _ma_bitmap_flushable() used by block format to signal when it starts modifying a bitmap and when it is done with it. storage/maria/ma_blockrec.c: When starting a row operation (insert/update/delete), mark that the bitmap is not flushable (because for example INSERT is going to over-allocate in the bitmap to prevent other threads from using our data pages). If a checkpoint comes at this moment it will wait for the bitmap to be flushable before flushing it. When the operation ends, bitmap becomes flushable again; that transition is done under the bitmap's mutex (needed for correct synchro with a concurrent checkpoint); but for INSERT/UPDATE this happens inside _ma_bitmap_release_unused() at a place where it already has the mutex, so the only penalty (mutex adding) is in DELETE and UNDO of INSERT. In case of errors after setting the bitmap unflushable, we must always set it back to flushable or checkpoint would block. Debug possibilities to force a sleep while the bitmap is over-allocated. 
In case of error in get_head_or_tail() in allocate_and_write_block_record(), we still need to unpin all pages. Bugfix: _ma_apply_redo_insert_row_blobs() produced wrong data_file_length. storage/maria/ma_blockrec.h: new bitmap calls. storage/maria/ma_checkpoint.c: filter_flush_indirect not needed anymore (flushing bitmap pages happens in _ma_bitmap_flush_all() now). So st_filter_param::is_data_file|pages_covered_by_bitmap not needed. Other filter_flush* don't need to flush bitmap anymore. Add debug possibility to flush all bitmap pages outside of a checkpoint, to simulate pagecache LRU eviction. When the background flush/checkpoint thread notices it has nothing to flush, it now sleeps directly until the next potential checkpoint moment instead of waking up every second. When in checkpoint we decide to not store a table in the checkpoint record (because it has logged no writes for example), we can also skip flushing this table. storage/maria/ma_commit.c: comment is out-of-date storage/maria/ma_key_recover.c: comment fix storage/maria/ma_loghandler.c: comment is out-of-date storage/maria/ma_open.c: comment is out-of-date storage/maria/ma_pagecache.c: comment for bug to fix. And we don't take checkpoints at end of REDO phase yet so can trust block->type. storage/maria/ma_recovery.c: Comments. Now-unneeded code for incomplete REDO-UNDO groups removed. When we forget about an old transaction we must really forget about it with bzero() (fixes the "long_trid!=0 assertion" recovery bug). When we delete a row with maria_delete() we turn on STATE_NOT_OPTIMIZED_ROWS so we do the same when we see a CLR_END for an UNDO_ROW_INSERT or when we execute an UNDO_ROW_INSERT (in both cases a row was deleted). Pick up max_long_trid from the checkpoint record. storage/maria/maria_chk.c: comment storage/maria/maria_def.h: MARIA_FILE_BITMAP gets new members: 'flushable', 'bitmap_cond' and 'pinned_pages'. 
storage/maria/trnman.c: I used to think that recovery only needs to know the maximum TrID of the lists of active and committed transactions. But no, sometimes both lists can even be empty and their TrID should not be reused. So Checkpoint now saves global_trid_generator in the checkpoint record. storage/maria/trnman_public.h: macros to read/store a TrID mysql-test/r/maria-recovery-bitmap.result: result is ok. Without the code fix, we would get a corruption message about the bitmap page in CHECK TABLE EXTENDED. mysql-test/t/maria-recovery-bitmap-master.opt: usual when we crash mysqld in tests mysql-test/t/maria-recovery-bitmap.test: test of recovery problems specific of the bitmap pages.
2007-12-14 16:14:12 +01:00
(tested). Another problem is that the REDO phase uses
PAGECACHE_PLAIN_PAGE, while Checkpoint only collects PAGECACHE_LSN_PAGE.
@todo fix this. pagecache_write() now can have a rec_lsn argument. And we
could make a function which goes through pages at end of REDO phase and
changes their type.
*/
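The @todo above suggests a pass that walks the pages at the end of the REDO phase and changes their type. A minimal sketch of that idea follows, on a toy page array rather than the real page cache. Every name in it (`toy_page`, `TOY_PLAIN_PAGE`, `TOY_LSN_PAGE`, `toy_promote_pages_after_redo`) is hypothetical. Using the LSN at which the REDO phase started as a fallback `rec_lsn` is an assumption made for illustration, not something this file does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real pagecache types. */
enum toy_page_type { TOY_PLAIN_PAGE, TOY_LSN_PAGE };

struct toy_page
{
  enum toy_page_type type;
  int dirty;         /* non-zero if the page was modified by a REDO */
  uint64_t rec_lsn;  /* 0 means "not set" in this toy model */
};

/*
  Walk all pages once the REDO phase is over. Dirty plain pages become
  LSN pages so that a checkpoint would collect them; a page without a
  rec_lsn gets 'redo_start_lsn' (assumed to be a safe lower bound).
  Returns the number of pages converted.
*/
static size_t toy_promote_pages_after_redo(struct toy_page *pages,
                                           size_t count,
                                           uint64_t redo_start_lsn)
{
  size_t converted= 0;
  for (size_t i= 0; i < count; i++)
  {
    struct toy_page *page= &pages[i];
    if (!page->dirty || page->type != TOY_PLAIN_PAGE)
      continue;                       /* clean or already an LSN page */
    page->type= TOY_LSN_PAGE;
    if (page->rec_lsn == 0)
      page->rec_lsn= redo_start_lsn;  /* never later than the real value */
    converted++;
  }
  return converted;
}
```

With such a pass in place, a checkpoint taken after the REDO phase would see a `rec_lsn` for every dirty page, so a later recovery would not start its REDO phase too late.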
#ifdef FIX_AND_ENABLE_LATER
WL#3071 - Maria checkpoint * Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing). * Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown. * fix for a test failure (after-merge fix) include/maria.h: new variable mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result: result update mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test: position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail) storage/maria/ha_maria.cc: Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init(). storage/maria/ma_checkpoint.c: Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset. storage/maria/ma_recovery.c: Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't make anything significant (didn't open any tables, didn't rollback any transactions). 
With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
if (take_checkpoints && checkpoint_useful)
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE. storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
{
/*
We take a checkpoint as it can save future recovery work if we crash
during the UNDO phase. But we don't flush pages, as UNDOs will probably
change them again.
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle) - fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc) - when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix) - UNDO_BULK_INSERT_WITH_REPAIR is also used with repair, changes name mysql-test/r/maria.result: tests for bugs fixed mysql-test/t/maria.test: tests for bugs fixed sql/sql_insert.cc: Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging. storage/maria/ha_maria.cc: Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()). storage/maria/ha_maria.h: variable now can have 3 states not 2 storage/maria/ma_bitmap.c: name change storage/maria/ma_blockrec.c: name change storage/maria/ma_blockrec.h: name change storage/maria/ma_check.c: * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before. * the log record of bulk insert can be used even without repair * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush). storage/maria/ma_key_recover.c: name change storage/maria/ma_loghandler.c: name change storage/maria/ma_loghandler.h: name change storage/maria/ma_pagecache.c: A function, to check in debug builds that no dirty pages exist for a file. 
storage/maria/ma_pagecache.h: new function (nothing in non-debug) storage/maria/ma_recovery.c: _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist). storage/maria/maria_chk.c: comment storage/maria/maria_def.h: new names mysql-test/suite/rpl/r/rpl_stm_maria.result: result (it used to crash) mysql-test/suite/rpl/t/rpl_stm_maria.test: Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
If we wanted to take checkpoints in the middle of the REDO phase, at a
moment when we haven't reached the end of log so don't have exact data
about transactions, we could write a special checkpoint: containing only
the list of dirty pages, otherwise to be treated as if it was at the
same LSN as the last checkpoint.
*/
if (ma_checkpoint_execute(CHECKPOINT_INDIRECT, FALSE))
goto err;
}
#endif
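The disabled block above takes a checkpoint at the end of the REDO phase only if the caller asked for checkpoints and Recovery did something significant (per the WL#3071 note: opened a table or rolled back a transaction). The following is a rough, self-contained sketch of that gating, not the engine's code. All `toy_` names are hypothetical, and the checkpoint itself is replaced by a callback so the logic can be run in isolation.

```c
#include <assert.h>

/* Hypothetical counters; the real code tracks this differently. */
struct toy_recovery_stats
{
  unsigned tables_opened;
  unsigned transactions_rolled_back;
};

/* Stand-in for the checkpoint call: returns non-zero on failure. */
typedef int (*toy_checkpoint_fn)(void *arg);

/* A checkpoint is worth taking only if Recovery changed something. */
static int toy_checkpoint_is_useful(const struct toy_recovery_stats *stats)
{
  return stats->tables_opened > 0 || stats->transactions_rolled_back > 0;
}

/*
  Returns 0 if the checkpoint was skipped or succeeded, and the
  callback's non-zero error code if it failed.
*/
static int toy_maybe_checkpoint(int take_checkpoints,
                                const struct toy_recovery_stats *stats,
                                toy_checkpoint_fn do_checkpoint, void *arg)
{
  if (!take_checkpoints || !toy_checkpoint_is_useful(stats))
    return 0;                 /* e.g. restart after a clean shutdown */
  return do_checkpoint(arg);
}

/* Test helper: counts how many checkpoints were requested. */
static int toy_count_checkpoint(void *arg)
{
  (*(int *) arg)++;
  return 0;
}
```

Skipping the call when nothing was done is what saves the fsync()s of the log and control file mentioned in the commit note.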
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
if (should_run_undo_phase)
{
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
if (run_undo_phase(uncommitted_trans))
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that cause reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd: 
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non-transactional tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page content doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readability Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ERR_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages.
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
{
ma_message_no_user(0, "Undo phase failed");
goto err;
}
}
else if (uncommitted_trans > 0)
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
{
UNDO of rows now puts back all parts of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for not transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed if REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that were defined as BOOLEAN in dbug.c code Added DEBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as test requires DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as test requires DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as test requires DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn off DBUG with --debug-on=0) Indentation fixes
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for not transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all part of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directroy entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minmum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_isnert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in it's original position Ensure allocation of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed som bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there is no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused paged cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easly mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings. 
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we where in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always includes row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Chaned version number of Maria files to not accidently use old ones (becasue of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before blobs was always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that tests recovery of update with blobs Removed comparision of .MAD file as it's not guranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_mara_row->min_length' for storing min length needed on head page Added 'st_mara_handler->blob_buff & blob_buff_size' for 
storing blobs Moved all tunning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before blobs was always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the random-ness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
eprint(tracef, "***WARNING: %u uncommitted transactions; some tables may"
Bugs fixed:
- If not in autocommit mode, delete rows one by one so that we can roll back if necessary
- bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
- Fixed bug in bitmap handling when allocating tail pages
- Ensure we reserve place for directory entry when calculating place for head and tail pages
- Fixed wrong value in bitmap->size[0]
- Fixed wrong assert in flush_log_for_bitmap
- Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
- Mark new pages as changed (required to get repair() to work)
- Fixed problem with advancing log horizon pointer within one page bounds
- Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
- Fixed bug in logging of rows with more than one big blob
- Fixed DBUG_ASSERT()s in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls
- Flush pagecache when we change from logging to not logging (if not, pagecache code breaks)
- Ensure my_errno is set on return from write/delete/update
- Fixed bug when using FIELD_SKIP_PRESPACE

New features:
- mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution
- Fix that optimize works for Maria tables
- maria_chk --zerofill now also clears freed blob pages
- maria_chk -di now prints more information about record page utilization

Optimizations:
- Use pagecache_unlock_by_link() instead of pagecache_write() if possible (avoids a memory copy and a find_block)
- Simplify code to abort when we found optimal bit pattern
- Skip also full head page bit patterns when searching for tail
- Increase default repair buffer to 128M for maria_chk and maria_read_log
- Increase default sort buffer for maria_chk to 64M
- Increase size of sortbuffer and pagecache for mysqld to 64M
- VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables

Better reporting:
- Fixed test of error condition for flush (for better error code)
- More error messages to mysqld if Maria recovery fails
- Always print warning if rows are deleted in repair
- Added global function _db_force_flush() that is usable when doing debugging in gdb
- Added call to my_debug_put_break_here() in case of some errors (for debugging)
- Remove used testfiles in unittest as these were written in different directories depending on from where the test was started

This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria

dbug/dbug.c:
  Added global function _db_force_flush() that is usable when doing debugging in gdb
extra/replace.c:
  Fixed memory leak
include/my_dbug.h:
  Prototype for _db_force_flush()
include/my_global.h:
  Added stdarg.h as my_sys.h now depends on it
include/my_sys.h:
  Make my_debug_put_break_here() a NOP if not DBUG build
  Added my_printv_error()
include/myisamchk.h:
  Added entry 'lost' to be able to count space that is lost forever
mysql-test/r/maria.result:
  Updated results
mysql-test/t/maria.test:
  Reset autocommit after test
  New test to check if delete_all_rows is used (verified with --debug)
mysys/my_error.c:
  Added my_printv_error()
scripts/mysql_fix_privilege_tables.sh:
  First use binaries and scripts from source distribution, then in installed distribution (this ensures that a development branch doesn't pick up wrong scripts)
sql/mysqld.cc:
  Fix that one can break maria recovery with ^C when debugging
sql/sql_class.cc:
  Removed #ifdef that has no effect (the preceding DBUG_ASSERT() ensures that the following code will not be executed)
storage/maria/ha_maria.cc:
  Increase size of sortbuffer and pagecache to 64M
  Fix that optimize works for Maria tables
  Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
  If not in autocommit mode, delete rows one by one so that we can roll back if necessary
  Fixed variable comments
storage/maria/ma_bitmap.c:
  More ASSERTS to detect overwrite of bitmap pages
  bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
  Ensure we reserve place for directory entry when calculating place for head and tail pages
  bitmap->size[0] should not include space for directory entry
  Simplify code to abort when we found optimal bit pattern
  Skip also full head page bit patterns when searching for tail (should speed up some common cases)
  Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes
  Fixed wrong assert in flush_log_for_bitmap
  Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
storage/maria/ma_blockrec.c:
  Ensure my_errno is set on return
  Fixed not optimal setting of row->min_length if we don't have variable length fields
  Use pagecache_unlock_by_link() instead of pagecache_write() if possible (avoids a memory copy and a find_block)
  Added DBUG_ASSERT() if we read or write wrong VARCHAR data
  Added DBUG_ASSERT() to find out if row sizes are calculated wrong
  Fixed bug in logging of rows with more than one big blob
storage/maria/ma_check.c:
  Disable logging while normal repair is done to avoid logging of index changes
  Fixed bug that caused CHECKSUM part of key page to be used
  Fixed that deletion of wrong records also works for BLOCK_RECORD
  Clear unallocated pages:
  - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not
  Better error reporting
  More information about record page utilization
  Change printing of file position to printing of pages to make output more readable
  Always print warning if rows are deleted
storage/maria/ma_create.c:
  Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future)
  Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL
  Fixed bug where fields could be used in wrong order
  Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization)
  Store other fields in length order (to get better utilization of head block)
storage/maria/ma_delete.c:
  Ensure my_errno is set on return
storage/maria/ma_dynrec.c:
  Indentation fix
storage/maria/ma_locking.c:
  Set changed if open_count is counted down (to avoid getting error "client is using or hasn't closed the table properly" with transactional tables)
storage/maria/ma_loghandler.c:
  Fixed problem with advancing log horizon pointer within one page bounds (patch from Sanja)
  Added more DBUG
  Indentation fixes
storage/maria/ma_open.c:
  Removed wrong casts
storage/maria/ma_page.c:
  Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new()
  Mark new pages as changed (required to get repair() to work)
storage/maria/ma_pagecache.c:
  Fixed test of error condition for flush
  Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock()
  Added call to my_debug_put_break_here() in case of errors (for debugging)
storage/maria/ma_pagecrc.c:
  Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32
storage/maria/ma_recovery.c:
  Call my_printv_error() from eprint() to get critical errors to mysqld log
  Removed \n from error strings to eprint() to get nicer output in mysqld
  Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed
storage/maria/ma_update.c:
  Ensure my_errno is set on return
storage/maria/ma_write.c:
  Ensure my_errno is set on return
storage/maria/maria_chk.c:
  Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries)
  Added option --skip-safemalloc
  Don't write exponents for rec/key
storage/maria/maria_def.h:
  Increase default repair buffer to 128M for maria_chk and maria_read_log
  Increase default sort buffer for maria_chk to 64M
storage/maria/unittest/Makefile.am:
  Don't update files automatically from bitkeeper
storage/maria/unittest/ma_pagecache_consist.c:
  Remove testfile at end
storage/maria/unittest/ma_pagecache_single.c:
  Remove testfile at end
storage/maria/unittest/ma_test_all-t:
  More tests
  Safer checking if test caused error
2008-01-07 17:54:41 +01:00
" be left inconsistent!***", uncommitted_trans);
recovery_warnings++;
WL#3072 - Maria recovery
maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed.

storage/maria/ha_maria.cc:
  as we now log a record when disabling transactionality, we need the TRN to be set up first
storage/maria/ma_blockrec.c:
  comment
storage/maria/ma_loghandler.c:
  new type of log record
storage/maria/ma_loghandler.h:
  new type of log record
storage/maria/ma_recovery.c:
  * maria_apply_log() now returns a count of warnings. What currently produces warnings is:
    - skipping applying UNDOs though there are some (=> inconsistent table)
    - replaying the log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (the log misses REDO_INSERT* for those and so is incomplete).
    The count of warnings affects the final message of maria_read_log and recovery (though in recovery neither of the two conditions above should happen).
  * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h:
  maria_apply_log() returns a count of warnings
storage/maria/maria_def.h:
  _ma_tmp_disable_logging_for_table() grows so becomes a function
storage/maria/maria_read_log.c:
  maria_apply_log() can now return a count of warnings, to temper the "SUCCESS" message printed at the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
}
WL#3072 - Maria Recovery
Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash).
Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start; this is easier for recovery (tells it to skip old REDOs or even the UNDO phase) and for the user (tells them to repair) in case of crash; sync files in the end.
Recovery skips a missing or corrupted table and moves to the next record (in REDO or UNDO phase) to be more robust; warns if this happens in UNDO phase.
Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes().
Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this.
include/myisamchk.h:
  new flag: bulk insert repair mustn't bump create_rename_lsn
mysql-test/lib/mtr_report.pl:
  skip normal warning in maria-recovery.test
mysql-test/r/maria-recovery.result:
  result: crash before bulk insert is committed causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing.
mysql-test/t/maria-recovery.test:
  - can't check the table or it would commit the transaction, but check is made after recovery.
  - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index).
storage/maria/CMakeLists.txt:
  new file
storage/maria/Makefile.am:
  new file
storage/maria/ha_maria.cc:
  - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert().
  - write log record for repair operation only after it's fully done, including the index sort (maria_repair*() used to write the log record).
  - Adding back file->trn=NULL which was removed by mistake earlier.
storage/maria/ha_maria.h:
  new member (see ha_maria.cc)
storage/maria/ma_bitmap.c:
  Functions to create missing bitmaps:
  - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon.
  - one function which the one above calls, and creates bitmaps in page cache
  - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above.
storage/maria/ma_blockrec.c:
  - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex.
  - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list
  - execution of UNDO_BULK_INSERT_WITH_REPAIR
storage/maria/ma_blockrec.h:
  new functions
storage/maria/ma_check.c:
  - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added
  - maria_repair() is allowed to re-enable logging only if it is the one which disabled it.
  - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record)
  - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair())
  - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end.
  - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair().
storage/maria/ma_create.c:
  Update skip_redo_lsn, create_rename_lsn optionally.
storage/maria/ma_delete_all.c:
  Need to sync files at end of maria_delete_all_rows(), if transactional.
storage/maria/ma_extra.c:
  During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE.
storage/maria/ma_key_recover.c:
  - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices.
  - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure
storage/maria/ma_key_recover.h:
  new prototype
storage/maria/ma_locking.c:
  DBUG_VOID_RETURN missing
storage/maria/ma_loghandler.c:
  UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type.
storage/maria/ma_loghandler.h:
  new UNDO and REDO
storage/maria/ma_open.c:
  Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value; this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes.
storage/maria/ma_recovery.c:
  - Skip a corrupted, missing, or repaired-with-maria_chk table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings.
  - Skip REDO|UNDO in REDO phase if <skip_redo_lsn.
  - If UNDO phase fails, delete transactions to not make trnman assert.
  - Update skip_redo_lsn when playing REDO_CREATE_TABLE
  - Don't record UNDOs for old transactions which we don't know (long_trid==0)
  - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c)
  - Execution of UNDO_BULK_INSERT_WITH_REPAIR
  - Don't try to find a page number in REDO_DELETE_ALL
  - Pieces moved to ma_recovery_util.c
storage/maria/ma_rename.c:
  name change
storage/maria/ma_static.c:
  I modified layout of the index' header (inserted skip_redo_lsn in its middle)
storage/maria/ma_test2.c:
  allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT
storage/maria/ma_test_recovery.expected:
  6 as testflag instead of 4
storage/maria/ma_test_recovery:
  Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug.
storage/maria/maria_chk.c:
  skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk().
storage/maria/maria_def.h:
  New member skip_redo_lsn in the state, and comments
storage/maria/maria_pack.c:
  skip_redo_lsn should be updated too, for consistency
storage/maria/ma_recovery_util.c:
  _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c).
storage/maria/ma_recovery_util.h:
  new file
2008-01-17 23:59:32 +01:00
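The skipping rules described in the commit message above (skip an unusable table, skip records older than skip_redo_lsn) can be sketched roughly as follows. This is only an illustration under stated assumptions: `lsn_t`, `struct table_state`, `enum action` and `redo_phase_action()` are hypothetical names invented here, not the types or functions used in ma_recovery.c, and the strict `<` comparison simply mirrors the wording "if <skip_redo_lsn".

```c
#include <stdint.h>

/* Hypothetical stand-in for an LSN: larger value means later in the log. */
typedef uint64_t lsn_t;

/* Hypothetical, simplified view of what recovery knows about a table. */
struct table_state
{
  lsn_t skip_redo_lsn; /* records older than this need not be replayed */
  int usable;          /* 0 if missing, corrupted or repaired by maria_chk */
};

enum action { APPLY, SKIP_OLD, SKIP_TABLE };

/*
  Decide what the REDO phase does with one record for one table:
  an unusable table is skipped without failing recovery, a record
  older than skip_redo_lsn is skipped, anything else is applied.
*/
static enum action redo_phase_action(const struct table_state *t,
                                     lsn_t rec_lsn)
{
  if (!t->usable)
    return SKIP_TABLE;
  if (rec_lsn < t->skip_redo_lsn)
    return SKIP_OLD;
  return APPLY;
}
```

A caller in the UNDO phase would presumably count every SKIP_TABLE result, which is the kind of counter that the skipped_undo_phase warning reports.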
if (skipped_undo_phase)
{
/*
We could want to print a list of tables for which UNDOs were skipped,
but not one line per skipped UNDO.
*/
eprint(tracef, "***WARNING: %lu UNDO records skipped in UNDO phase; some"
" tables may be left inconsistent!***", skipped_undo_phase);
recovery_warnings++;
}
old_now= now;
now= my_getsystime();
if (recovery_message_printed == REC_MSG_UNDO)
{
Added --with-maria-tmp-tables (default one) to allow one to configure if Maria should be used for internal temporary tables
Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables
Fixed bug that caused update of big blobs to crash
Use pagecache_page_no_t as type for pages (to get rid of compiler warnings)
Added cast to get rid of compiler warning
Fixed wrong types of variables and arguments that caused lost information
Fixed wrong DBUG_ASSERT() that caused REDO of big blobs to fail
Removed some historical ifdefs that caused problems with Windows compilations
BUILD/SETUP.sh:
  Added --with-maria-tmp-tables
include/maria.h:
  Use pagecache_page_no_t as type for pages
  Use my_bool as parameter for 'rep_quick' option
include/my_base.h:
  Added comment
mysql-test/r/maria-big.result:
  Added test that uses big blobs
mysql-test/t/maria-big.test:
  Added test that uses big blobs
sql/mysqld.cc:
  Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables
sql/sql_class.h:
  Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined
sql/sql_select.cc:
  Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined
storage/maria/ha_maria.cc:
  Fixed compiler warnings reported by MCC
  - Fixed usage of wrong types that caused data loss
  - Changed parameter for rep_quick to my_bool
  - Added safe casts
  Fixed indentation
storage/maria/ma_bitmap.c:
  Use pagecache_page_no_t as type for pages
  Fixed compiler warnings
  Fixed bug that caused update of big blobs to crash
storage/maria/ma_blockrec.c:
  Use pagecache_page_no_t as type for pages
  Use my_bool as parameter for 'rep_quick' option
  Fixed compiler warnings
  Fixed wrong DBUG_ASSERT()
storage/maria/ma_blockrec.h:
  Use pagecache_page_no_t as type for pages
storage/maria/ma_check.c:
  Fixed some wrong parameters where we didn't get all bits for test_flag
  Changed rep_quick to be of type my_bool
  Use pagecache_page_no_t as type for pages
  Added casts to get rid of compiler warnings
  Changed type of record_pos to get rid of compiler warning
storage/maria/ma_create.c:
  Added safe casts to get rid of compiler warnings
storage/maria/ma_dynrec.c:
  Fixed usage of wrong type
storage/maria/ma_key.c:
  Fixed compiler warning
storage/maria/ma_key_recover.c:
  Use pagecache_page_no_t as type for pages
storage/maria/ma_loghandler_lsn.h:
  Added casts to get rid of compiler warnings
storage/maria/ma_page.c:
  Changed variable name from 'page' to 'pos' as it was an offset and not a page address
  Moved page_size inside block to get rid of compiler warning
storage/maria/ma_pagecache.c:
  Fixed compiler warnings
  Replaced compile time assert with TODO
storage/maria/ma_pagecache.h:
  Use pagecache_page_no_t as type for pages
storage/maria/ma_pagecrc.c:
  Allow bitmap pages that are all zero
storage/maria/ma_preload.c:
  Added cast to get rid of compiler warning
storage/maria/ma_recovery.c:
  Changed types to get rid of compiler warnings
  Use bool for quick_repair to get rid of compiler warning
  Fixed some variables that were wrongly declared (not enough precision)
  Added cast to get rid of compiler warning
storage/maria/ma_test2.c:
  Remove historical undefs
storage/maria/maria_chk.c:
  Changed rep_quick to bool
  Fixed wrong parameter to maria_chk_data_link()
storage/maria/maria_def.h:
  Use pagecache_page_no_t as type for pages
storage/maria/maria_pack.c:
  Renamed isam -> maria
storage/maria/plug.in:
  Added option --with-maria-tmp-tables
storage/maria/trnman.c:
  Added cast to get rid of compiler warning
storage/myisam/mi_test2.c:
  Remove historical undefs
2008-01-10 20:21:36 +01:00
double phase_took= (now - old_now)/10000000.0;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break)
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel)
REDO_REPAIR now also logs used key map
Fixed some bugs in REDO logging of key pages
Better error messages from maria_read_log
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code.
Don't allow pagecaches with less than 8 blocks (causes strange crashes)
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise)
Fixed bug in ma_pagecache unit tests that caused program to sometimes fail
Added some missing calls to MY_INIT() that caused some unit tests to fail
Fixed that TRUNCATE works properly on temporary MyISAM files
Updated some result files to new table checksum results (checksum when NULL fields are ignored)
perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am:
  Change mode to -rw-rw-r--
BitKeeper/etc/ignore:
  added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log
include/maria.h:
  Added maria_tmpdir
include/my_base.h:
  Added error HA_ERR_FILE_TOO_SHORT
include/my_sys.h:
  Added variable my_crc_dbug_check
  Added function my_dbug_put_break_here()
include/myisamchk.h:
  Added org_key_map (needed for writing REDO record for REPAIR)
mysql-test/r/innodb.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/mix2_myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/t/myisam.test:
  Added used table
mysys/checksum.c:
  Added DBUG for checksum results
  Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC
mysys/lf_alloc-pin.c:
  Fixed compiler warning
mysys/my_handler.c:
  Added new error message
mysys/my_init.c:
  If my_progname is not given, use 'unknown' for my_progname_short
  Added debugger function my_debug_put_break_here()
mysys/my_pread.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
mysys/my_read.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
sql/mysqld.cc:
  Added debug option --debug-crc-break
sql/sql_parse.cc:
  Trivial optimization
storage/maria/ha_maria.cc:
  Renamed variable to be more logical
  Ensure that param.testflag is correct when calling repair
  Added extra argument to init_pagecache
  Set default value for maria_tmpdir
storage/maria/ma_blockrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_check.c:
  Set param->testflag to match how repair is run (needed for REDO logging)
  Simple optimization
  Moved flag if page is node from pagelength to keypage-flag byte
  Log used key map in REDO log.
storage/maria/ma_delete.c:
  Remember previous UNDO entry when writing undo (for future CLR records)
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed some bugs in redo logging
  Added CRC for some translog REDO_INDEX entries
storage/maria/ma_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_ft_update.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_key_recover.c:
  Added CRC for some translog REDO_INDEX entries
  Removed not needed pagecache_write() in _ma_apply_redo_index()
storage/maria/ma_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_loghandler.c:
  Added used key map to REDO_REPAIR_TABLE
storage/maria/ma_loghandler.h:
  Added operation for checksum of key pages
storage/maria/ma_open.c:
  Allocate storage for undo lsn pointers
storage/maria/ma_pagecache.c:
  Remove not needed include file
  Change logging to use fd: for file descriptors as other code
  Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log
  Don't allow pagecaches with less than 8 blocks
  Remove wrong DBUG_ASSERT()
storage/maria/ma_pagecache.h:
  Added readwrite_flags
storage/maria/ma_recovery.c:
  Better error messages for maria_read_log:
  - Added eprint() for printing error messages
  - Print extra \n before error message if we are printing %0 %10 ...
  Added used key_map to REDO_REPAIR log entry
  More DBUG
  Call same repair method that was used by mysqld
storage/maria/ma_rt_index.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_rt_key.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_rt_split.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_static.c:
  Added maria_tmpdir
storage/maria/ma_test1.c:
  Updated call to init_pagecache()
storage/maria/ma_test2.c:
  Updated call to init_pagecache()
storage/maria/ma_test3.c:
  Updated call to init_pagecache()
storage/maria/ma_write.c:
  Removed #ifdef NOT_YET
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed bug in _ma_log_del_prefix()
storage/maria/maria_chk.c:
  Fixed wrong min limit for page_buffer_size
  Updated call to init_pagecache()
storage/maria/maria_def.h:
  Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/maria_ftdump.c:
  Updated call to init_pagecache()
storage/maria/maria_pack.c:
  Updated call to init_pagecache()
  Reset share->state.create_rename_lsn & share->state.is_of_horizon
storage/maria/maria_read_log.c:
  Better error messages
  Added --tmpdir option (needed to set temporary directory for REDO_REPAIR)
  Added --start-from-lsn
  Changed option for --display-only to 'd' (wanted to use -o for 'offset')
storage/maria/unittest/lockman2-t.c:
  Added missing call to MY_INIT()
storage/maria/unittest/ma_pagecache_consist.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_pagecache_single.c:
  Fixed bug that caused program to sometimes fail
  Added some DBUG_ASSERTS()
  Changed some calls to malloc()/free() to my_malloc()/my_free()
  Create extra file to expose original hard-to-find bug
storage/maria/unittest/ma_test_loghandler-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/test_file.c:
  Changed malloc()/free() to my_malloc()/my_free()
  Fixed memory leak
  Changed logic a bit while trying to find bug in reset_file()
storage/maria/unittest/trnman-t.c:
  Added missing call to MY_INIT()
storage/myisam/mi_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_create.c:
  Removed O_EXCL to get TRUNCATE to work for temporary files
storage/myisam/mi_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
mysql-test/r/old-mode.result:
  New BitKeeper file ``mysql-test/r/old-mode.result''
mysql-test/t/old-mode-master.opt:
  New BitKeeper file ``mysql-test/t/old-mode-master.opt''
mysql-test/t/old-mode.test:
  New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
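The idea behind HA_ERR_FILE_TOO_SHORT in the commit message above, namely reporting "the file ended early" separately from a generic read failure, might look roughly like the sketch below. It uses plain POSIX pread() rather than the mysys wrappers, and `enum read_status` and `read_exact_at()` are hypothetical names made up for this illustration; the real my_pread() flags and return conventions are not reproduced here.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes; the real code uses errors such as HA_ERR_FILE_TOO_SHORT. */
enum read_status { READ_OK = 0, READ_IO_ERROR, READ_FILE_TOO_SHORT };

/*
  Read exactly 'count' bytes at 'offset'. A genuine I/O error and
  "end of file reached before all bytes were read" are reported
  separately, so a caller can tell a truncated file from a failing disk.
*/
static enum read_status read_exact_at(int fd, void *buf, size_t count,
                                      off_t offset)
{
  unsigned char *p = buf;
  size_t done = 0;

  while (done < count)
  {
    ssize_t n = pread(fd, p + done, count - done, offset + (off_t) done);
    if (n < 0)
    {
      if (errno == EINTR)
        continue;                  /* interrupted, retry the same range */
      return READ_IO_ERROR;
    }
    if (n == 0)
      return READ_FILE_TOO_SHORT;  /* EOF before 'count' bytes were read */
    done += (size_t) n;
  }
  return READ_OK;
}
```

With a distinct "too short" result, a tool like maria_read_log can print a specific message instead of a generic read error, which matches the stated goal of better error messages.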
procent_printed= 1;
fprintf(stderr, " (%.1f seconds); ", phase_took);
fflush(stderr);
}
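The phase timing printed above divides a difference of two my_getsystime() values by 10000000.0, which suggests that the timestamps count 100-nanosecond ticks (10^7 per second). A minimal sketch of that conversion, under that assumption and with the hypothetical names `TICKS_PER_SECOND` and `ticks_to_seconds()`:

```c
#include <stdint.h>

/* Assumption: timestamps are in 100 ns ticks, i.e. 10^7 ticks per second. */
#define TICKS_PER_SECOND 10000000.0

/* Convert the interval between two tick counts into seconds. */
static double ticks_to_seconds(uint64_t start, uint64_t end)
{
  return (double) (end - start) / TICKS_PER_SECOND;
}
```

Converting to double before dividing keeps the fractional part, which the "%.1f seconds" format then rounds to one decimal.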
/*
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full-length bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks its marker, conclude it's corrupted, and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page did not in fact exist yet, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO creates a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
we don't use maria_panic() because it would call maria_end(), and Recovery does
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state")). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile.
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update its is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here.
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
not want that (we want to keep some modules initialized for runtime).
*/
if (close_all_tables())
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that caused reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional table with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional table with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd:
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non-transactional tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page content doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readability Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ERR_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages.
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
{
ma_message_no_user(0, "closing of tables failed");
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
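The three-way classification of record types described above for ma_loghandler.h (a record may not end a group, may end a group, or may be a group in itself) can be sketched as a small state machine. This is only an illustration: the enum, struct and function names below are hypothetical and are not the identifiers used in ma_loghandler.h or ma_recovery.c, and the counters stand in for the real execution of records.

```c
#include <stddef.h>

/* Hypothetical stand-in for the enum that replaced the old bool. */
enum record_group_role {
  REC_NOT_GROUP_END,  /* belongs to a group that is still being collected */
  REC_GROUP_END,      /* closes the current group (for example an UNDO) */
  REC_IS_GROUP        /* a group in itself (for example COMMIT) */
};

/* Bookkeeping only; a real implementation would execute the records. */
struct group_state {
  size_t pending;   /* records of the unfinished group seen so far */
  size_t executed;  /* records belonging to groups that were applied */
  size_t aborted;   /* records dropped because their group never ended */
};

/* Feed one record's role into the state machine. */
static void see_record(struct group_state *st, enum record_group_role role)
{
  switch (role) {
  case REC_NOT_GROUP_END:
    st->pending++;                   /* keep collecting */
    break;
  case REC_GROUP_END:
    st->executed+= st->pending + 1;  /* apply the whole group */
    st->pending= 0;
    break;
  case REC_IS_GROUP:
    st->aborted+= st->pending;       /* abort any unfinished group */
    st->pending= 0;
    st->executed++;                  /* the record itself is applied */
    break;
  }
}
```

With this shape, a COMMIT that follows REDOs whose UNDO was never logged discards those REDOs instead of executing them, which is the case the commit message describes.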
2007-07-26 11:56:21 +02:00
goto err;
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that cause reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd: 
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non-transactional tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page content doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readability Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ERR_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages. 
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
}
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
old_now= now;
now= my_getsystime();
if (recovery_message_printed == REC_MSG_FLUSH)
{
Added --with-maria-tmp-tables (default one) to allow on to configure if Maria should be used for internal temporary tables Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables Fixed bug that caused update of big blobs to crash Use pagecache_page_no_t as type for pages (to get rid of compiler warnings) Added cast to get rid of compiler warning Fixed wrong types of variables and arguments that caused lost information Fixed wrong DBUG_ASSERT() that caused REDO of big blobs to fail Removed some historical ifdefs that caused problem with windows compilations BUILD/SETUP.sh: Added --with-maria-tmp-tables include/maria.h: Use pagecache_page_no_t as type for pages Use my_bool as parameter for 'rep_quick' option include/my_base.h: Added comment mysql-test/r/maria-big.result: Added test that uses big blobs mysql-test/t/maria-big.test: Added test that uses big blobs sql/mysqld.cc: Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables sql/sql_class.h: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined sql/sql_select.cc: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined storage/maria/ha_maria.cc: Fixed compiler warnings reported by MCC - Fixed usage of wrong types that caused data loss - Changed parameter for rep_quick to my_bool - Added safe casts Fixed indentation storage/maria/ma_bitmap.c: Use pagecache_page_no_t as type for pages Fixed compiler warnings Fixed bug that caused update of big blobs to crash storage/maria/ma_blockrec.c: Use pagecache_page_no_t as type for pages Use my_bool as parameter for 'rep_quick' option Fixed compiler warnings Fixed wrong DBUG_ASSERT() storage/maria/ma_blockrec.h: Use pagecache_page_no_t as type for pages storage/maria/ma_check.c: Fixed some wrong parameters where we didn't get all bits for test_flag Changed rep_quick to be of type my_bool Use pagecache_page_no_t as type for pages Added cast's to get rid of compiler 
warnings Changed type of record_pos to get rid of compiler warning storage/maria/ma_create.c: Added safe cast's to get rid of compiler warnings storage/maria/ma_dynrec.c: Fixed usage of wrong type storage/maria/ma_key.c: Fixed compiler warning storage/maria/ma_key_recover.c: Use pagecache_page_no_t as type for pages storage/maria/ma_loghandler_lsn.h: Added cast's to get rid of compiler warnings storage/maria/ma_page.c: Changed variable name from 'page' to 'pos' as it was an offset and not a page address Moved page_size inside block to get rid of compiler warning storage/maria/ma_pagecache.c: Fixed compiler warnings Replaced compile time assert with TODO storage/maria/ma_pagecache.h: Use pagecache_page_no_t as type for pages storage/maria/ma_pagecrc.c: Allow bitmap pages that are all zero storage/maria/ma_preload.c: Added cast to get rid of compiler warning storage/maria/ma_recovery.c: Changed types to get rid of compiler warnings Use bool for quick_repair to get rid of compiler warning Fixed some variables that were wrongly declared (not enough precision) Added cast to get rid of compiler warning storage/maria/ma_test2.c: Remove historical undefs storage/maria/maria_chk.c: Changed rep_quick to bool Fixed wrong parameter to maria_chk_data_link() storage/maria/maria_def.h: Use pagecache_page_no_t as type for pages storage/maria/maria_pack.c: Renamed isam -> maria storage/maria/plug.in: Added option --with-maria-tmp-tables storage/maria/trnman.c: Added cast to get rid of compiler warning storage/myisam/mi_test2.c: Remove historical undefs
2008-01-10 20:21:36 +01:00
double phase_took= (now - old_now)/10000000.0;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
procent_printed= 1;
fprintf(stderr, " (%.1f seconds); ", phase_took);
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
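The skipping rules listed above for ma_recovery.c (skip a table that is missing, corrupted or repaired with maria_chk; skip a record in the REDO phase if its LSN is below skip_redo_lsn) can be sketched as one decision function. Everything below is a simplified assumption made for illustration: the type lsn_t, the struct table_for_recovery and the function name are hypothetical, and LSNs are treated as plain 64-bit integers that can be ordered with <.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t lsn_t;  /* hypothetical stand-in for a log sequence number */

/* Hypothetical, reduced view of what recovery knows about a table. */
struct table_for_recovery {
  bool usable;          /* false if missing, corrupted or repaired by maria_chk */
  lsn_t skip_redo_lsn;  /* records older than this must not be applied */
};

/*
  Returns true if a record with the given LSN should be skipped in the
  REDO phase. Skipping is not an error: recovery moves on to the next record.
*/
static bool redo_phase_skips_record(const struct table_for_recovery *tbl,
                                    lsn_t record_lsn)
{
  if (!tbl->usable)
    return true;                            /* nothing to apply the record to */
  return record_lsn < tbl->skip_redo_lsn;   /* older than the skip horizon */
}
```

A record whose LSN equals skip_redo_lsn is applied in this sketch, following the "<skip_redo_lsn" wording of the commit message.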
2008-01-17 23:59:32 +01:00
fflush(stderr);
}
WL#3071 - Maria checkpoint * Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing). * Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown. * fix for a test failure (after-merge fix) include/maria.h: new variable mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result: result update mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test: position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail) storage/maria/ha_maria.cc: Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init(). storage/maria/ma_checkpoint.c: Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset. storage/maria/ma_recovery.c: Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't make anything significant (didn't open any tables, didn't rollback any transactions). 
With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
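The rule in the commit message above (skip the checkpoint when Recovery did nothing significant) can be sketched in isolation. This is only an illustration: `recovery_summary`, `checkpoint_is_useful()`, `maybe_checkpoint()` and the `checkpoints_taken` counter are hypothetical names, not Maria's, and the real code calls `ma_checkpoint_execute()` rather than bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a recovery run did; not a Maria structure. */
struct recovery_summary
{
  unsigned tables_opened;
  unsigned transactions_rolled_back;
};

/* Stand-in for real checkpoint I/O: counts checkpoints "taken". */
static unsigned checkpoints_taken= 0;

/* A checkpoint is worth its fsync()s only if recovery changed something. */
static bool checkpoint_is_useful(const struct recovery_summary *s)
{
  assert(s != NULL);
  return s->tables_opened > 0 || s->transactions_rolled_back > 0;
}

/* Returns true if a checkpoint was taken at the end of recovery. */
static bool maybe_checkpoint(bool take_checkpoints,
                             const struct recovery_summary *s)
{
  if (take_checkpoints && checkpoint_is_useful(s))
  {
    checkpoints_taken++;
    return true;
  }
  return false;
}
```

After a clean shutdown both counters in the summary would be zero, so no checkpoint is taken and startup avoids the extra syncs of the log and control file.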
if (take_checkpoints && checkpoint_useful)
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
{
  /* No dirty pages, all tables are closed, no active transactions, save: */
  if (ma_checkpoint_execute(CHECKPOINT_FULL, FALSE))
    goto err;
}
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves into the bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more. When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it).
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
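The three-way grouping described in the commit message above (a record may not end a group, may end a group, or may be a group in itself and thereby abort any unfinished group) can be illustrated with a small replay loop. This is a sketch under assumptions: the enum constants, `replay_counts` and `replay()` are hypothetical names chosen for the example, and "executing" a record is reduced to counting it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical roles a log record type can play with respect to groups. */
enum group_role
{
  REC_NOT_LAST_IN_GROUP,  /* e.g. a REDO: waits for the end of its group */
  REC_LAST_IN_GROUP,      /* e.g. an UNDO: closes the group, group is run */
  REC_IS_GROUP_ITSELF     /* e.g. a COMMIT: runs alone, aborts open group */
};

struct replay_counts
{
  unsigned executed;  /* records that would be applied */
  unsigned aborted;   /* records dropped because their group never ended */
};

static struct replay_counts replay(const enum group_role *recs, size_t n)
{
  struct replay_counts c= {0, 0};
  unsigned pending= 0;  /* records of the currently unfinished group */
  assert(recs != NULL || n == 0);
  for (size_t i= 0; i < n; i++)
  {
    switch (recs[i])
    {
    case REC_NOT_LAST_IN_GROUP:
      pending++;
      break;
    case REC_LAST_IN_GROUP:
      c.executed+= pending + 1;  /* whole group, including this record */
      pending= 0;
      break;
    case REC_IS_GROUP_ITSELF:
      c.aborted+= pending;       /* unfinished group is discarded */
      pending= 0;
      c.executed++;
      break;
    }
  }
  /* Records still pending at the end of the log are simply never executed. */
  return c;
}
```

In the failed-write scenario from the message, two REDOs followed directly by a COMMIT would count as two aborted records and one executed record, so the orphaned REDOs are not replayed.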
goto end;
err:
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updated some result files to new table checksum results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
tprint(tracef, "\nRecovery of tables with transaction logs FAILED\n");
err2:
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
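The "skip REDO|UNDO in REDO phase if <skip_redo_lsn" rule from the commit message above amounts to an LSN comparison per record. The sketch below follows the message's wording (strictly less than); the boundary comparison in the real code may differ, and `lsn_t`, `redo_phase_should_skip()` and `count_applied()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical LSN type: a monotonically increasing log position. */
typedef uint64_t lsn_t;

/* A record older than the table's skip_redo_lsn must not be replayed,
   e.g. because the table was repaired after that record was written. */
static bool redo_phase_should_skip(lsn_t record_lsn, lsn_t skip_redo_lsn)
{
  return record_lsn < skip_redo_lsn;
}

/* Counts how many of the given records would be applied to the table. */
static size_t count_applied(const lsn_t *lsns, size_t n, lsn_t skip_redo_lsn)
{
  size_t applied= 0;
  assert(lsns != NULL || n == 0);
  for (size_t i= 0; i < n; i++)
    if (!redo_phase_should_skip(lsns[i], skip_redo_lsn))
      applied++;
  return applied;
}
```

Bumping `skip_redo_lsn` at the start of a repair, as the message describes, makes every earlier record for that table fall on the skipped side of this test.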
if (trns_created)
  delete_all_transactions();
error= 1;
if (close_all_tables())
{
  ma_message_no_user(0, "closing of tables failed");
}
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c:
  * no need to sync the data file if table is not transactional
  * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store).
  * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR).
  * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update).
  * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update).
  * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now).
storage/maria/ma_blockrec.h:
  new functions. maria_bitmap_marker does not need to be exported.
storage/maria/ma_close.c:
  we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()).
storage/maria/ma_control_file.c:
  when in Recovery, some assertions should not be used.
storage/maria/ma_control_file.h:
  double-inclusion safe
storage/maria/ma_create.c:
  during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c
storage/maria/ma_delete_table.c:
  during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero.
storage/maria/ma_loghandler.c:
  * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state").
  * LOG_DESC::record_ends_group changed to an enum.
  * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected
  * Sanja please see the @todo LOG BUG
  * avoiding DBUG_RETURN(func()) as it gives confusing debug traces.
storage/maria/ma_loghandler.h:
  - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had
  - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log.
storage/maria/ma_recovery.c:
  New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Callable from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled.
storage/maria/ma_recovery.h:
  new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore.
storage/maria/ma_rename.c:
  don't write log records during recovery
storage/maria/ma_test2.c:
  - fail if maria_info() or other subtests find some wrong information
  - new option -g to skip updates.
  - init the translog before creating the table, so that log applying can work.
  - in "#if 0" you'll see some fixed bugs (will be removed).
storage/maria/ma_test_all.sh:
  cleanup files. Test log applying.
storage/maria/maria_read_log.c:
  most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code.
storage/maria/ma_test_recovery:
  unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
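The group rule described in the commit message above (a record may not end a group, may end a group, or may be a group in itself, in which case it aborts any unfinished group) can be illustrated with a small scanner. This is only a sketch under stated assumptions: the enum values, the struct and the function name are hypothetical and are not taken from ma_loghandler.h, and treating a group left open at the end of the log as aborted is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the real enum in ma_loghandler.h may differ. */
enum group_role {
  REC_NOT_LAST_IN_GROUP,  /* e.g. a REDO: more records of the group follow */
  REC_LAST_IN_GROUP,      /* e.g. an UNDO: closes the group of REDOs       */
  REC_IS_GROUP_ITSELF     /* e.g. COMMIT: stands alone, aborts open group  */
};

struct group_stats {
  size_t executed;  /* records that belong to complete groups            */
  size_t aborted;   /* records dropped because their group never ended   */
};

/* Walk the records in log order and decide which ones would be executed. */
struct group_stats scan_groups(const enum group_role *recs, size_t n)
{
  struct group_stats st = {0, 0};
  size_t pending = 0;  /* records of the currently open group */

  assert(recs != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    switch (recs[i]) {
    case REC_NOT_LAST_IN_GROUP:
      pending++;                   /* keep it until we know the group ends */
      break;
    case REC_LAST_IN_GROUP:
      st.executed += pending + 1;  /* the whole group is complete */
      pending = 0;
      break;
    case REC_IS_GROUP_ITSELF:
      st.aborted += pending;       /* unfinished group is thrown away */
      pending = 0;
      st.executed += 1;            /* the record itself is processed */
      break;
    }
  }
  st.aborted += pending;  /* assumption: an open group at end of log is dropped */
  return st;
}
```

With the sequence REDO, REDO, COMMIT the two REDOs are counted as aborted and only the COMMIT is processed, which matches the failed-write scenario given in the message.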
end:
UNDO of rows now puts back all parts of the row on their original pages and positions
Added variable _dbug_on_ to speed up execution when DBUG is not going to be used
Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0)
Fixed some bugs with 'non_flushable' marking of bitmap pages
Don't use 'non_flushable' marking of bitmap pages for non-transactional tables
SHOW CREATE TABLE now shows if table was created with page checksums
Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO
More tests (especially for blobs) and DBUG_ASSERTS()
More readable output from maria_read_log and maria_chk
Fixed wrong shift that caused Maria to crash on files > 4G
Mark tables as crashed if REDO fails

dbug/dbug.c:
  Changed to use my_bool (allowed me to remove some Windows-specific code)
  Added variable _dbug_on_ to speed up execution when DBUG is not going to be used
  Removed initialization of variables if not needed
include/my_dbug.h:
  Use my_bool for some functions that were defined as BOOLEAN in dbug.c code
  Added DEBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used
include/my_global.h:
  Define my_bool early
  Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago
mysql-test/mysql-test-run.pl:
  Added debug-on=0 to speed up tests
mysql-test/r/maria-recovery.result:
  Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid
mysql-test/r/maria.result:
  Added testing of page checksums
mysql-test/t/crash_commit_before-master.opt:
  Added --debug-on as test requires DBUG to work
mysql-test/t/maria-recovery-bitmap-master.opt:
  Added --debug-on as test requires DBUG to work
mysql-test/t/maria-recovery-master.opt:
  Added --debug-on as test requires DBUG to work
mysql-test/t/maria-recovery.test:
  Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid
mysql-test/t/maria.test:
  Added testing of page checksums
sql/mysqld.cc:
  Added --debug-on option (to be able to turn off DBUG with --debug-on=0)
  Indentation fixes
  Removed end spaces
sql/sql_show.cc:
  Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used
storage/maria/Makefile.am:
  Added ma_test_big.sh
storage/maria/ha_maria.cc:
  Store in create_info if page checksums are used (for SHOW CREATE TABLE)
storage/maria/ma_bitmap.c:
  Added _ma_bitmap_wait_or_flush() to cause readers of bitmap pages to wait with reading until the bitmap is flushed.
  Use TAIL_PAGE_COUNT_MARKER for tail pages
  Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's
  Don't allocate more than 0x3fff pages in one extent (we need bit 0x4000 as a START_EXTENT_BIT)
  Increase the calculated 'head_length' with the number of bytes used for extents.
  Update row->space_on_head_page also in _ma_bitmap_find_new_place()
  Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling)
  Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not.
  Don't use 'non_flushable' marking of bitmap pages for non-transactional tables.
  Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages.
  Added more DBUG_ASSERT() to find possible errors in other code
  Some code simplifications by adding new local variables
storage/maria/ma_blockrec.c:
  UNDO of rows now puts back all parts of the row on their original pages and positions.
  Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information
  This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free)
  Use PAGE_COUNT to mark if an extent is the start of a blob. (Needed for extent_to_bitmap_blocks())
  Added check_directory() for checking that directory entries are correct.
  Added checking of row checksums when reading rows (with EXTRA_DEBUG)
  Added make_space_for_directory() and extend_directory() for doing expansion of directory
  Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO
  Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry
  Added _ma_update_at_original_place() for UNDO of DELETES
  Added row->min_length to hold minimum required space needed on head page
  Changed find_free_position() to use make_space_for_directory()
  Changed make_empty_page() to allow optional creation of directory entry
  Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization)
  Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page
  Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position
  Ensure allocations of tail blocks are of at least MIN_TAIL_SIZE.
  Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache)
  Write original extent information in UNDO entry, not compacted ones (we need position to tails!)
  When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP)
  Fixed some bugs in directory handling
  Fixed bug where we wrote wrong lsn to blob pages
  Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs
  Ensure we call _ma_bitmap_flushable() also in case of errors
  When doing an update, first delete old entries, then search in bitmap for where to put new information
  Info->s -> share
  Rowid -> rowid
  More DBUG_ASSERT()
storage/maria/ma_blockrec.h:
  Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER
  Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits()
storage/maria/ma_check.c:
  Don't write extra empty line if there are no deleted blocks
  Ignore START_EXTENT_BIT's in page count
  Call _ma_fast_unlock_key_del() to free key_del link
storage/maria/ma_close.c:
  Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del())
storage/maria/ma_create.c:
  Changed constant to macro
storage/maria/ma_delete.c:
  For deleted keys, log also position to row
storage/maria/ma_extra.c:
  Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER
storage/maria/ma_key_recover.c:
  Added bzero() of LSN that confused page cache in case of uninitialized block
  Mark file crashed if applying of index changes fails
  Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link.
storage/maria/ma_locking.c:
  Added usage of MARIA_FILE_OPEN_COUNT_OFFSET
  Added _ma_mark_file_crashed()
storage/maria/ma_loghandler.c:
  Fixed bug where we logged uninitialized memory
storage/maria/ma_open.c:
  Moved state->changed to be at start of state info on disk to allow one to easily mark files as crashed
storage/maria/ma_page.c:
  Disable 'dummy' checksumming of pages as this gave false warnings. (Need to investigate if this is ever needed)
storage/maria/ma_pagecache.c:
  Fixed wrong shift that caused Maria to crash on files > 4G
storage/maria/ma_recovery.c:
  In case of errors, start writing on new line if we were in %## %## printing mode (made errors more readable)
  Changed global variable name from warnings -> recovery_warnings
  Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant
  Removed special handling of row position for deleted keys. Keys now always include row positions
  _ma_apply_undo_row_delete() now gets page and row position
  Added check that we don't loop forever when handling undos (in case of bug in undo chain)
  Print name of failed REDO/UNDO
storage/maria/ma_recovery.h:
  Removed old comment
storage/maria/ma_static.c:
  Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables)
storage/maria/ma_test2.c:
  Added option -u to specify number of rows to update
  Changed old option -u to be -A, as for ma_test1
  Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update)
  First created blob is now of max blob length to ensure we have at least one big blob in the table
storage/maria/ma_test_all.sh:
  More tests
storage/maria/ma_test_recovery.expected:
  Updated results
storage/maria/ma_test_recovery:
  Changed tests to use bigger blobs (not just 1K)
  Added new tests that test recovery of update with blobs
  Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO)
storage/maria/ma_update.c:
  Simplify code (changed * to if)
storage/maria/maria_chk.c:
  Make output more readable
storage/maria/maria_def.h:
  Changed 'changed' to int to prepare for more bits
  Added 2 more bytes to status information
  Added 'st_maria_row->min_length' for storing min length needed on head page
  Added 'st_maria_handler->blob_buff & blob_buff_size' for storing blobs
  Moved all tuning parameters into one block
  Added MARIA_SMALL_BLOB_BUFFER
  Added _ma_mark_file_crashed()
storage/myisam/mi_test2.c:
  Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update)
storage/maria/ma_test_big.sh:
  Testing of insert, update, delete, recovery and undo of rows with blobs
  Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
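The commit message above reserves bit 0x4000 of an extent's page count as START_EXTENT_BIT, which is why a single extent has to stay within the 14 low bits. A minimal sketch of that packing follows, assuming a 16-bit stored page count. Only START_EXTENT_BIT and its value come from the message; MAX_EXTENT_PAGES and the three helper functions are hypothetical names made up for the illustration, and the real on-disk layout may carry further flag bits.

```c
#include <assert.h>
#include <stdint.h>

#define START_EXTENT_BIT 0x4000u  /* value given in the commit message */
#define MAX_EXTENT_PAGES 0x3fffu  /* hypothetical name: largest count that
                                     leaves the flag bit free */

/* Pack a page count and the "first extent of a blob" flag into 16 bits. */
uint16_t encode_page_count(uint32_t pages, int starts_extent)
{
  assert(pages <= MAX_EXTENT_PAGES);  /* larger extents must be split */
  return (uint16_t) (pages | (starts_extent ? START_EXTENT_BIT : 0u));
}

/* Recover the plain page count, ignoring the flag bit. */
uint16_t decode_page_count(uint16_t stored)
{
  return (uint16_t) (stored & MAX_EXTENT_PAGES);
}

/* Tell whether the stored value marks the start of an extent. */
int is_extent_start(uint16_t stored)
{
  return (stored & START_EXTENT_BIT) != 0;
}
```

A checker that only cares about the number of pages would call decode_page_count(), which is the spirit of the "Ignore START_EXTENT_BIT's in page count" note for ma_check.c.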
error_handler_hook= save_error_handler_hook;
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)

mysys/my_realloc.c:
  noted an issue in my_realloc()
sql/mysqld.cc:
  page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc:
  update to new prototype
storage/maria/ma_bitmap.c:
  when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but misses a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c:
  Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c:
  empty data file has a bitmap page
storage/maria/ma_control_file.c:
  useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h:
  new prototype
storage/maria/ma_create.c:
  Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c:
  as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h:
  new function
storage/maria/ma_loghandler_lsn.h:
  maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
  * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
  * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
  * comments.
storage/maria/ma_recovery.c:
  * preparations for the UNDO phase: recreate TRNs
  * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
  * reworking all around (less duplication)
storage/maria/ma_recovery.h:
  a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c:
  tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
  * update to new prototype
  * no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
  * a function for Recovery to create a transaction (TRN), needed in the UNDO phase
  * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h:
  new functions
2007-08-29 16:43:01 +02:00
hash_free(&all_dirty_pages);
bzero(&all_dirty_pages, sizeof(all_dirty_pages));
my_free(dirty_pages_pool, MYF(MY_ALLOW_ZERO_PTR));
dirty_pages_pool= NULL;
post-merge fixes, and fixes for some of the 16 compiler warnings found in pushbuild on sapsrv1. Some not fixed as not repeatable on my machine (32/64 bit issue?).
Fixes for some test failures:
- "maria-connect" now passes;
- "maria": after fixing the obvious reasons for failures, the test went further and hit a more complex issue: difference in the output of EXPLAIN; not fixed;
- "ps_maria" still crashes in assertion mysqld: ha_maria.cc:1627: virtual int ha_maria::index_read(uchar*, const uchar*, uint, ha_rkey_function): Assertion `inited == INDEX' failed, as already observable in pushbuild.
All this might just be due to an incomplete merge of MyISAM changes into Maria when 5.1 was last merged to mysql-maria.

include/my_global.h:
  temporary fix until next merge of 5.1; without this it does not build
mysql-test/r/maria-connect.result:
  position changed
mysql-test/t/maria-connect.test:
  If one wants to use the binlog it has to ask for it. 1582 is not used for dup entry error anymore (it was in older 5.1). Size of first event in binlog was increased by 4 (when the new type of event "gap" was added).
mysql-test/t/maria.test:
  1582 not used anymore in this case
storage/maria/ha_maria.cc:
  engine now has to say what binlogging it supports
storage/maria/ma_blockrec.c:
  fix for compiler warnings ("comparison is always true" or "always false")
storage/maria/ma_loghandler.c:
  fix for compiler warnings (comparing char* to uchar*)
storage/maria/ma_packrec.c:
  fix for compiler warning (fix simply merged from MyISAM)
storage/maria/ma_pagecache.c:
  info_check_pin() was not used so gave a compiler warning.
storage/maria/ma_pagecache.h:
  fixing typo from the last 5.1->maria merge.
storage/maria/ma_recovery.c:
  my_free() has a void* argument, so why cast. byte->uchar.
storage/maria/ma_search.c:
  fix for compiler warning (fix simply merged from MyISAM)
storage/maria/maria_read_log.c:
  gptr->uchar*
storage/maria/trnman.c:
  probable fix for warning found in pushbuild (but not on my machine): storage/maria/trnman.c: 142 passing argument 6 of ‘lf_hash_init’ from incompatible pointer type on sapsrv1.
2007-07-26 17:51:49 +02:00
my_free(all_tables, MYF(MY_ALLOW_ZERO_PTR));
all_tables= NULL;
my_free(all_active_trans, MYF(MY_ALLOW_ZERO_PTR));
all_active_trans= NULL;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
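The three-way group classification described in the commit message above (a record may not end a group, may end a group, or may be a group in itself and abort any unfinished group) can be sketched as follows. This is only a minimal illustration under stated assumptions: the enum values, the struct and feed_record() are hypothetical names invented for this sketch, not the identifiers used in ma_loghandler.h or ma_recovery.c.

```c
/* Hypothetical names; the real enum lives in ma_loghandler.h. */
enum group_role
{
  REC_NOT_GROUP_END,    /* e.g. a REDO: collected in the current group */
  REC_ENDS_GROUP,       /* e.g. an UNDO: closes the group, which is executed */
  REC_IS_GROUP_ITSELF   /* e.g. COMMIT: stands alone, aborts an open group */
};

/* Counters standing in for "what Recovery did with the records". */
struct group_state
{
  unsigned pending;     /* records collected in the unfinished group */
  unsigned executed;    /* records executed so far */
  unsigned aborted;     /* records dropped because their group never ended */
};

/* Feed one log record, in log order, to the group tracker. */
static void feed_record(struct group_state *st, enum group_role role)
{
  switch (role)
  {
  case REC_NOT_GROUP_END:
    st->pending++;                      /* wait for the group's end */
    break;
  case REC_ENDS_GROUP:
    st->executed+= st->pending + 1;     /* whole group, including this record */
    st->pending= 0;
    break;
  case REC_IS_GROUP_ITSELF:
    st->aborted+= st->pending;          /* unfinished group is not executed */
    st->pending= 0;
    st->executed++;                     /* the record itself is processed */
    break;
  }
}
```

In this sketch, two REDOs followed directly by a COMMIT leave both REDOs unexecuted, which mirrors the failed-write scenario the commit message uses to motivate the change.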
my_free(log_record_buffer.str, MYF(MY_ALLOW_ZERO_PTR));
log_record_buffer.str= NULL;
log_record_buffer.length= 0;
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
ma_checkpoint_end();
Fix for LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery More DBUG_PRINT (to simplify future debugging) Aria: Added STATE_IN_REPAIR, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Aria: Some trivial speedup optimization Aria: Better warning if table was marked crashed by unfinished repair mysql-test/lib/v1/mysql-test-run.pl: Fix so one can run RQG mysql-test/suite/maria/r/maria-recovery2.result: Update for new error message. mysys/stacktrace.c: Fixed compiler warning storage/maria/ha_maria.cc: More DBUG_PRINT Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Don't log query for dropping temporary table. storage/maria/ha_maria.h: Added prototype for drop_table() storage/maria/ma_blockrec.c: More DBUG_PRINT Make read_long_data() inline for most cases. (Trivial speedup optimization) storage/maria/ma_check.c: Better warning if table was marked crashed by unfinished repair storage/maria/ma_open.c: More DBUG_PRINT storage/maria/ma_recovery.c: Give warning if found crashed table. Changed warning for tables that can't be opened. storage/maria/ma_recovery_util.c: Write warnings to DBUG file storage/maria/maria_chk.c: Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. storage/maria/maria_def.h: Added maria_mark_in_repair(x) storage/maria/maria_read_log.c: Added option: --character-sets-dir storage/maria/trnman.c: By default set min_read_from to max value. This allows us to remove TRN:s from rows during recovery to get more space. This fixes bug LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
2010-07-30 09:45:27 +02:00
*warnings_count= recovery_warnings + recovery_found_crashed_tables;
if (recovery_message_printed != REC_MSG_NONE)
{
UNDO of rows now puts back all part of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn of DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for not transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed of REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that was defined as BOOLEAN in dbug.c code Added DBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn of DBUG with --debug-on=0) Indentation fixes 
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for not transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all part of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directroy entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minimum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position Ensure allocation of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed some bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there are no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused page cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easily mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings. 
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we were in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always include row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that test recovery of update with blobs Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_maria_row->min_length' for storing min length needed on head page Added 'st_maria_handler->blob_buff & blob_buff_size' for 
storing blobs Moved all tuning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
if (procent_printed)
{
procent_printed= 0;
fprintf(stderr, "\n");
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
fflush(stderr);
}
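The skipping rules listed for ma_recovery.c in the 2008-01-17 commit message above (skip a corrupted or missing table instead of failing, and skip a REDO in the REDO phase if its LSN is below the table's skip_redo_lsn) can be illustrated with a small decision function. This is a sketch under assumptions: lsn_t, struct table_state, enum redo_action and decide_redo() are hypothetical stand-ins, and the strict "<" comparison simply follows the wording of the commit message rather than the actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

typedef uint64_t lsn_t;              /* stand-in for Maria's LSN type */

/* Hypothetical, reduced view of what Recovery knows about a table. */
struct table_state
{
  lsn_t skip_redo_lsn;               /* REDOs older than this are obsolete */
  bool corrupted;                    /* table could not be used */
};

enum redo_action
{
  REDO_APPLY,                        /* execute the record */
  REDO_SKIP_OLD,                     /* record predates skip_redo_lsn */
  REDO_SKIP_TABLE                    /* table missing or corrupted: move on */
};

/* Decide what the REDO phase does with one record for one table. */
static enum redo_action decide_redo(const struct table_state *table,
                                    lsn_t record_lsn)
{
  if (table == NULL || table->corrupted)
    return REDO_SKIP_TABLE;          /* don't fail, go to the next record */
  if (record_lsn < table->skip_redo_lsn)
    return REDO_SKIP_OLD;            /* e.g. table was repaired later */
  return REDO_APPLY;
}
```

Returning a distinct value for the skipped-table case leaves room for the caller to warn only when the skip happens in the UNDO phase, as the commit message describes.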
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that cause reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd: 
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non transactional tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page content doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readability Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ERR_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages. 
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
if (!error)
Fixed some bugs in the Maria storage engine - Changed default recovery mode from OFF to NORMAL to get automatic repair of not properly closed tables. - Fixed a race condition when two threads call external_lock and thr_lock() in different order. When this happened the transaction that called external lock first and thr_lock() last did not see the rows from the other transaction, even if it had to wait in thr_lock() for the other to complete. - Fixed that one can run maria_chk on automatically recovered tables without warnings about too small transaction id - Don't give warning that crashed table could not be repaired if repair was disabled (and thus not run) - Fixed an error result from flush_key_cache() which caused a DBUG_ASSERT() when one was using concurrent reads on non transactional tables that were updated. client/mysqldump.c: Add "" around error message to make it more readable client/mysqltest.cc: Free environment variables mysql-test/r/mysqldump.result: Updated results mysql-test/r/openssl_1.result: Updated results mysql-test/suite/maria/r/maria-recover.result: Updated results mysql-test/suite/maria/r/maria3.result: Updated results mysql-test/suite/maria/t/maria3.test: Added more test of temporary tables storage/maria/ha_maria.cc: Changed default recovery mode from OFF to NORMAL to get automatic repair of not properly closed tables. Start transaction in ma_block_get_status() instead of in ha_maria::external_lock(). - This fixes a race condition when two threads call external lock and thr_lock() in different order. When this happened the transaction that called external lock first and thr_lock() last did not see the rows from the other transaction, even if it had to wait in thr_lock() for the other to complete. Store latest transaction id in control file if recovery was done. 
- This allows one to run maria_chk on automatically recovered tables without warnings about too small transaction id storage/maria/ha_maria.h: Don't give warning that crashed table could not be repaired if repair was disabled (and thus not run) storage/maria/ma_blockrec.h: Added new function "_ma_block_get_status_no_versioning()" storage/maria/ma_init.c: Added hook to create trn in ma_block_get_status() if we are using MariaDB storage/maria/ma_open.c: Ensure we call _ma_block_get_status_no_versioning() for transactional tables without versioning (like tables with fulltext) storage/maria/ma_pagecache.c: Allow one to flush blocks that are pinned for read. This fixed an error result from flush_key_cache() which caused a DBUG_ASSERT() when one was using concurrent reads on non transactional tables that were updated. storage/maria/ma_recovery.c: Set maria_recovery_changed_data to 1 if recovery changed something. Set max_trid_in_control_file to max found trn if we found a bigger trn. This ensures that the control file is up to date after recovery, which allows one to run maria_chk on the tables without warnings about too big trn storage/maria/ma_state.c: Call maria_create_trn_hook() in _ma_setup_live_state() instead of ha_maria::external_lock() This ensures that 'state' and trn are in sync and thus fixes the race condition mentioned for ha_maria.cc storage/maria/ma_static.c: Added maria_create_trn_hook() and maria_recovery_changed_data storage/maria/maria_def.h: Added MARIA_HANDLER->external_ptr, which is used to hold MariaDB thd. Added some new external variables Removed reference to non existing function: maria_concurrent_inserts()
2010-06-14 00:13:32 +02:00
{
WL#3071 Maria checkpoint, WL#3072 Maria recovery instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]). Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line. sql_print_error() changed to use my_progname_short (nicer display). mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r"). include/my_sys.h: new flags to tell error_handler_hook that this is not an error but an information or warning mysql-test/mysql-test-run.pl: when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r") mysys/my_init.c: set my_progname_short mysys/my_static.c: my_progname_short added sql/mysqld.cc: * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria. 
* plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs()) * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen) storage/maria/ma_checkpoint.c: fprintf(stderr) -> ma_message_no_user() storage/maria/ma_checkpoint.h: function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr). storage/maria/ma_recovery.c: To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line. 071116 18:42:16 [Note] mysqld: Maria engine: starting recovery recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds); 071116 18:42:16 [Note] mysqld: Maria engine: recovery done storage/maria/maria_chk.c: my_progname_short moved to mysys storage/maria/maria_read_log.c: my_progname_short moved to mysys storage/myisam/myisamchk.c: my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
ma_message_no_user(ME_JUST_INFO, "recovery done");
2010-06-14 00:13:32 +02:00
maria_recovery_changed_data= 1;
}
}
2010-06-14 00:13:32 +02:00
else if (!error && max_trid_in_control_file != max_long_trid)
{
/*
maria_end() will set max trid in log file so that one can run
maria_chk on the tables
*/
2010-06-14 00:13:32 +02:00
maria_recovery_changed_data= 1;
}
if (error && !abort_message_printed)
2007-12-18 02:21:32 +01:00
my_message(HA_ERR_INITIALIZATION,
"Maria recovery failed. Please run maria_chk -r on all maria "
"tables and delete all maria_log.######## files", MYF(0));
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
procent_printed= 0;
WL#4374 "Maria - force start if Recovery fails multiple times" http://forge.mysql.com/worklog/task.php?id=4374 new option --maria-force-start-after-recovery-failures=N; number of consecutive recovery failures (failures of log reading or recovery processing, anything in [translog_init(),maria_recovery_from_log()]) is stored in the control file; if at a Maria start they are more than N, logs are removed. This is for automated systems which have to run whatever happens. As tables risk staying corrupted, --maria-recover should also be used on them: this revision makes maria-recover work (it was disabled). Fixed bug in translog_is_log_files(). translog_init() now prints message to error log if failed. Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there. KNOWN_BUGS.txt: As option --maria-force-start-after-recovery-failures is added, it corresponds to the wish "we should fix that if this happens etc". LOAD INDEX is not ignored since a few weeks. Listed concurrency bugs have been fixed some time ago. Recovery of fulltext and GIS indexes works since a few weeks. mysql-test/include/maria_make_snapshot.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/include/maria_make_snapshot_for_comparison.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/include/maria_verify_recovery.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/lib/mtr_report.pl: new test maria-recover.test generates expected corruption warnings in the error log. maria-recovery.test's corrupted table is renamed to t_corrupted1 instead of t1. mysql-test/r/maria-preload.result: result update. 
maria_pagecache_read* values are similar to the previous version of this file, though a bit bigger because using the information_schema and the join leads to some internal maria temp table being used, and thus some blocks of it being read. mysql-test/r/maria-purge.result: engine's name in SHOW ENGINE MARIA LOGS changed. mysql-test/r/maria-recover.result: result for new test. We see corruption messages at first SELECT and then none at second SELECT, expected. mysql-test/r/maria-recovery.result: result update mysql-test/r/maria.result: new variables show up mysql-test/t/disabled.def: BUG#34911 is not fixed but the test had been made independent of the bug (workaround). A new bug (crash) has popped recently, so it has to stay disabled (BUG#35107). mysql-test/t/maria-preload.test: Work around BUG#34911 "FLUSH STATUS doesn't flush what it should": compute differences in status variables before and after relevant queries mysql-test/t/maria-recover-master.opt: test --maria-recover mysql-test/t/maria-recover.test: Test of the --maria-recover option (build a corrupted table and see if it is auto-repaired) mysql-test/t/maria-recovery-big.test: update for new API of include/maria*.inc mysql-test/t/maria-recovery-bitmap.test: update for new API of include/maria*.inc mysql-test/t/maria-recovery.test: update for new API of include/maria*.inc. Corrupted table t1 renamed to t_corrupted1, so that mtr_report.pl does not blindly remove all corruption messages for t1 which is a common name. storage/maria/ha_maria.cc: Enabling maria-recover. Adding option and global variable --maria_force_start_after_recovery_failures: ha_maria_init() calls mark_recovery_start() and mark_recovery_success() to keep track of failed consecutive recoveries and remove logs if needed. Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there. 
storage/maria/ma_checkpoint.c: new prototype storage/maria/ma_control_file.c: Storing in one byte in the control file, the number of consecutive recovery failures. storage/maria/ma_control_file.h: new prototype storage/maria/ma_init.c: new prototype storage/maria/ma_locking.c: Need to update open_count on disk at first write and close for transactional tables, like we already did for non-transactional tables, otherwise we cannot notice that the table is dubious. storage/maria/ma_loghandler.c: translog_is_log_files() is made more generic to serve either to search or to delete logs (the latter is for --maria-force-start-after-recovery-failures). It also had a bug (always returned FALSE). storage/maria/ma_loghandler.h: export function because ha_maria::mark_recovery_start() needs it storage/maria/ma_recovery.c: changing name of maria_recover() to distinguish from the maria-recover option. storage/maria/ma_recovery.h: changing name of maria_recover() to distinguish from the maria-recover option. storage/maria/ma_test_force_start.pl: Test of --maria-force-start-after-recovery-failures (and also, to be realistic, of --maria-recover). This is standalone because mysql-test-run does not support testing that multiple mysqld restarts expectedly failed. I'll have to run it on my machine and also on a Windows machine. storage/maria/unittest/ma_control_file-t.c: adding recovery_failures to the test storage/maria/unittest/ma_test_loghandler_multigroup-t.c: fix for compiler warning (unused variable in non-debug build)
2008-06-02 22:53:25 +02:00
/*
We don't cleanly close tables if we hit some error: a clean close could
corrupt them by flushing wrong blocks produced by wrong REDOs. Skipping the
close also leaves their open_count>0, which ensures that --maria-recover,
if used, will try to repair them.
*/
2007-07-26 11:56:21 +02:00
DBUG_RETURN(error);
}
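The WL#4374 commit note above describes a one-byte counter of consecutive recovery failures kept in the control file, with logs removed once the count is more than N. A minimal sketch of that bookkeeping follows. It is not the code in ha_maria.cc: the `sketch_` names, the struct, the signatures, the saturation at 255 and the rule that N == 0 disables the feature are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the control file; only the counter is modelled. */
typedef struct
{
  uint8_t recovery_failures;   /* consecutive failed recoveries, one byte */
} sketch_control_file;

/*
  Call before log reading/recovery begins.
  Returns 1 if the caller should remove the logs, i.e. more than
  max_failures consecutive failures are already recorded; else returns 0.
  Assumption: max_failures == 0 means the feature is disabled.
  The counter is then incremented; only a success resets it.
*/
static int sketch_mark_recovery_start(sketch_control_file *cf,
                                      unsigned max_failures)
{
  int remove_logs;
  assert(cf != NULL);
  remove_logs= (max_failures > 0 && cf->recovery_failures > max_failures);
  if (cf->recovery_failures < UINT8_MAX)   /* assumption: saturate at 255 */
    cf->recovery_failures++;
  return remove_logs;
}

/* Call once recovery has completed without error. */
static void sketch_mark_recovery_success(sketch_control_file *cf)
{
  assert(cf != NULL);
  cf->recovery_failures= 0;
}
```

Incrementing before recovery runs, rather than after it fails, means a crash in the middle of recovery is still counted as a failure at the next start.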
/* very basic info about the record's header */
static void display_record_position(const LOG_DESC *log_desc,
const TRANSLOG_HEADER_BUFFER *rec,
uint number)
{
/*
If number==0, we are going over records which we have already seen and which
form a group, so we indent them below the group's end record.
*/
2007-12-04 22:23:42 +01:00
tprint(tracef,
"%sRec#%u LSN (%lu,0x%lx) short_trid %u %s(num_type:%u) len %lu\n",
number ? "" : " ", number, LSN_IN_PARTS(rec->lsn),
rec->short_trid, log_desc->name, rec->type,
2007-07-26 11:56:21 +02:00
(ulong)rec->record_length);
if (rec->type == LOGREC_DEBUG_INFO)
{
/* Print some extra information */
(*log_desc->record_execute_in_redo_phase)(rec);
}
2007-07-26 11:56:21 +02:00
}
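A minimal, self-contained sketch of the dispatch pattern used by display_and_apply_record(): each record type carries an optional REDO-phase hook, a missing hook is treated as an error, and a non-zero return from the hook is reported. The names below (demo_record, demo_log_desc, demo_apply_record and the two demo hooks) are hypothetical stand-ins for illustration only; they are not the real LOG_DESC or TRANSLOG_HEADER_BUFFER definitions, and the return convention is an assumption.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for TRANSLOG_HEADER_BUFFER: just a payload. */
typedef struct demo_record { int payload; } demo_record;

/* Hypothetical stand-in for LOG_DESC: a name plus an optional REDO hook. */
typedef struct demo_log_desc
{
  const char *name;
  int (*record_execute_in_redo_phase)(const demo_record *rec);
} demo_log_desc;

/* Demo hooks: one always succeeds, one returns the payload as error code. */
static int demo_apply_ok(const demo_record *rec)   { (void) rec; return 0; }
static int demo_apply_fail(const demo_record *rec) { return rec->payload; }

/*
  Returns 0 on success, 1 if the record type has no REDO hook,
  otherwise the error code returned by the hook (assumed convention).
*/
static int demo_apply_record(const demo_log_desc *desc, const demo_record *rec)
{
  int error;
  if (desc->record_execute_in_redo_phase == NULL)
  {
    /* A record type without a hook cannot be applied: report and fail. */
    fprintf(stderr, "no REDO hook for record type %s\n", desc->name);
    return 1;
  }
  if ((error= (*desc->record_execute_in_redo_phase)(rec)))
    fprintf(stderr, "Got error %d when executing record %s\n",
            error, desc->name);
  return error;
}
```

Keeping the hook as a nullable function pointer in the descriptor lets new record types be added to the table before their REDO code exists, while still failing loudly if such a record is met during log applying.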
static int display_and_apply_record(const LOG_DESC *log_desc,
                                    const TRANSLOG_HEADER_BUFFER *rec)
{
  int error;
  if (log_desc->record_execute_in_redo_phase == NULL)
  {
    /* die on all not-yet-handled records :) */
Fixed bug that 'maria_read_log -a' didn't set max_trid when repairing tables. Fixed bug in Aria when replacing short keys with long keys and a key tree both overflows and underflows at the same time. Fixed several bugs when generating recovery logs when using RQG with replacing long keys with short keys and vice versa. Lots of new DBUG_ASSERT()'s. Added more information to recovery log to make it easier to know from where a log entry originated. Introduced MARIA_PAGE->org_size that tells what the size of the page was in the last log entry. This allows us to find out if all key changes for an index page were logged. Small code cleanups: - Introduced _ma_log_key_changes() to log crc of key page changes - Added share->max_index_block_size as max size of data one can put in a key block (block_size - KEYPAGE_CHECKSUM_SIZE). This will later simplify adding a directory to index pages. - Write page number instead of page position to DBUG log. mysql-test/lib/v1/mysql-test-run.pl: Use --general-log instead of --log to disable warning when using RQG. sql/mysqld.cc: If we have already sent ok to client when we get an error, log this to stderr. Don't disable option --log-output if CSV engine is not supported. storage/maria/ha_maria.cc: Log queries to recovery log also in LOCK TABLES. storage/maria/ma_check.c: If param->max_trid is set, use this value instead of max_trid_in_system(). This is used by recovery to set max_trid to max seen trid so far. keyinfo->block_length - KEYPAGE_CHECKSUM_SIZE -> max_index_block_size (Style optimization). storage/maria/ma_delete.c: Mark tables crashed early. Write page number instead of page position to debug log. Added parameter to ma_log_delete() and ma_log_prefix() that is logged so that we can find where wrong log entries were generated. Fixed bug where a page was not properly written when the same key tree had both an overflow and an underflow when deleting a key. 
keyinfo->block_length - KEYPAGE_CHECKSUM_SIZE => max_index_block_size (Style optimization). ma_log_delete() now has an extra parameter for how many bytes from the end of the page should be appended to the log for the page (for page overflows). storage/maria/ma_key_recover.c: Added extra parameter to ma_log_prefix() to indicate what caused the log entry. Update MARIA_PAGE->org_size when logging info about page. Many more DBUG_ASSERT()'s. Fix some bugs in maria_log_add() to handle page overflows. Added _ma_log_key_changes() to log crc of key page changes. If EXTRA_STORE_FULL_PAGE_IN_KEY_CHANGES is defined, log the resulting pages to the log so one can trivially see what the resulting page should have looked like (for errors in CRC values). storage/maria/ma_key_recover.h: Added _ma_log_key_changes() which is only called if EXTRA_DEBUG_KEY_CHANGES is defined. Updated function prototypes. storage/maria/ma_loghandler.h: Added more values to en_key_debug, to get more exact location where things went wrong when logging to recovery log. storage/maria/ma_open.c: Initialize share->max_index_block_size. storage/maria/ma_page.c: Added updating and testing of MARIA_PAGE->org_size. Write page number instead of page position to DBUG log. Generate error if we read page with wrong data. Removed wrong assert: key_del_current != share->state.key_del. Simplify _ma_log_compact_keypage(). storage/maria/ma_recovery.c: Set param.max_trid to max seen trid before running repair table (used for alter table to create index). storage/maria/ma_rt_key.c: Update call to _ma_log_delete(). storage/maria/ma_rt_split.c: Use _ma_log_key_changes(). Update MARIA_PAGE->org_size. storage/maria/ma_unique.c: Remove casts. storage/maria/ma_write.c: keyinfo->block_length - KEYPAGE_CHECKSUM_SIZE => share->max_index_block_length. 
Updated calls to _ma_log_prefix(). Changed code to use _ma_log_key_changes(). Update ma_page->org_size. Fixed bug in _ma_log_split() for pages that overflow. Added KEY_OP_DEBUG logging to functions. Log KEYPAGE_FLAG in all log entries. storage/maria/maria_def.h: Added SHARE->max_index_block_size. Added MARIA_PAGE->org_size. storage/maria/trnman.c: Reset flags for new transaction.
2010-09-06 01:25:44 +02:00
    DBUG_ASSERT("one more hook to write" == 0);
    return 1;
  }
  if ((error= (*log_desc->record_execute_in_redo_phase)(rec)))
    eprint(tracef, "Got error %d when executing record %s",
UNDO of rows now puts back all parts of the row on their original pages and positions. Added variable _dbug_on_ to speed up execution when DBUG is not going to be used. Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0). Fixed some bugs with 'non_flushable' marking of bitmap pages. Don't use 'non_flushable' marking of bitmap pages for non-transactional tables. SHOW CREATE TABLE now shows if table was created with page checksums. Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO. More tests (especially for blobs) and DBUG_ASSERTS(). More readable output from maria_read_log and maria_chk. Fixed wrong shift that caused Maria to crash on files > 4G. Mark tables as crashed if REDO fails. dbug/dbug.c: Changed to use my_bool (allowed me to remove some Windows-specific code). Added variable _dbug_on_ to speed up execution when DBUG is not going to be used. Removed initialization of variables if not needed. include/my_dbug.h: Use my_bool for some functions that were defined as BOOLEAN in dbug.c code. Added DEBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used. include/my_global.h: Define my_bool early. Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago. mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests. mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid. mysql-test/r/maria.result: Added testing of page checksums. mysql-test/t/crash_commit_before-master.opt: Added --debug-on as the test requires DBUG to work. mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as the test requires DBUG to work. mysql-test/t/maria-recovery-master.opt: Added --debug-on as the test requires DBUG to work. mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid. mysql-test/t/maria.test: Added testing of page checksums. sql/mysqld.cc: Added --debug-on option (to be able to turn off DBUG with --debug-on=0). Indentation fixes. 
Removed end spaces. sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used. storage/maria/Makefile.am: Added ma_test_big.sh. storage/maria/ha_maria.cc: Store in create_info if page checksums are used (for SHOW CREATE TABLE). storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause readers of bitmap pages to wait with reading until the bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages. Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's. Don't allocate more than 0x3ffff pages in one extent (we need bit 0x4000 as a START_EXTENT_BIT). Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place(). Make _ma_bitmap_get_page_bits() global (needed for UNDO handling). Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for non-transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code. Some code simplifications by adding new local variables. storage/maria/ma_blockrec.c: UNDO of rows now puts back all parts of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on the head block and also extent information. This changes a lot of logic as now an insert of a row on a page may happen at any position (and not just at the first or next free one). Use PAGE_COUNT to mark if an extent is the start of a blob (needed for extent_to_bitmap_blocks()). Added check_directory() for checking that directory entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG). Added make_space_for_directory() and extend_directory() for doing expansion of directory. Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO. Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry. Added _ma_update_at_original_place() for UNDO of DELETES. Added row->min_length to hold minimum required space needed on head page. Changed find_free_position() to use make_space_for_directory(). Changed make_empty_page() to allow optional creation of directory entry. Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization). Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page. Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position. Ensure allocations of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed (if not, we will have pages pinned forever in page cache). Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed som bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there is no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused paged cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easly mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings. 
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we where in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always includes row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Chaned version number of Maria files to not accidently use old ones (becasue of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before blobs was always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that tests recovery of update with blobs Removed comparision of .MAD file as it's not guranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_mara_row->min_length' for storing min length needed on head page Added 'st_mara_handler->blob_buff & blob_buff_size' for 
storing blobs Moved all tunning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before blobs was always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the random-ness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
my_errno, log_desc->name);
return error;
}
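The hook that follows looks up a per-short-id slot (all_active_trans[sid]) and inspects its long_trid, group_start_lsn and undo_lsn, and the commit notes for this file mention that forgetting an old transaction must clear its slot entirely with bzero(). A minimal sketch of that bookkeeping is given here, purely as an illustration: every sketch_* name is hypothetical, the LSN and TrID types are plain integer stand-ins, and the rule for refusing to reuse a slot is an assumption, not the logic of ma_recovery.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t sketch_trid;  /* hypothetical stand-in for TrID */
typedef uint64_t sketch_lsn;   /* hypothetical stand-in for LSN */
#define SKETCH_LSN_IMPOSSIBLE ((sketch_lsn) 0)

/* One slot per possible 16-bit short transaction id. */
struct sketch_trans
{
  sketch_trid long_trid;       /* 0 means the slot is free */
  sketch_lsn group_start_lsn;  /* set while a record group is open */
  sketch_lsn undo_lsn;         /* last UNDO still to be rolled back */
};

static struct sketch_trans sketch_active[65536];

/* Forget a transaction completely, so that no stale field survives. */
static void sketch_forget(uint16_t sid)
{
  memset(&sketch_active[sid], 0, sizeof(sketch_active[sid]));
}

/*
  Bind a short id to a new long TrID.
  Returns 0 on success, -1 if the slot is still held by a transaction
  that has pending UNDO work (assumed policy, for illustration only).
*/
static int sketch_bind_long_trid(uint16_t sid, sketch_trid new_long_trid)
{
  struct sketch_trans *slot= &sketch_active[sid];
  /* Same idea as the assertion in the hook: no group may be open here. */
  assert(slot->group_start_lsn == SKETCH_LSN_IMPOSSIBLE);
  if (slot->long_trid != 0)
  {
    if (slot->undo_lsn != SKETCH_LSN_IMPOSSIBLE)
      return -1;
    sketch_forget(sid);
  }
  slot->long_trid= new_long_trid;
  return 0;
}
```

Clearing the whole slot rather than only long_trid is the point of the sketch: a leftover undo_lsn or group_start_lsn from an earlier owner of the short id would otherwise be attributed to the next transaction that reuses it.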
prototype_redo_exec_hook(LONG_TRANSACTION_ID)
{
uint16 sid= rec->short_trid;
TrID long_trid= all_active_trans[sid].long_trid;
WL#3072 - Maria recovery. * fix for bitmap vs checkpoint bug which could lead to corrupted tables in case of crashes at certain moments: a bitmap could be flushed to disk even though it was inconsistent with the log (it could be flushed before REDO-UNDO are written to the log). One bug remains, need code from others. Tests added. Fix is to pin unflushable bitmap pages, and let checkpoint wait for them to be flushable. * fix for long_trid!=0 assertion failure at Recovery. * less useless wakeups in the background flush|checkpoint thread. * store global_trid_generator in checkpoint record. mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery.test: make it easier to locate subtests storage/maria/ma_bitmap.c: When we send a bitmap to the pagecache, if this bitmap is not in a flushable state we keep it pinned and add it to a list, it will be unpinned when the bitmap is flushable again. A new function _ma_bitmap_flush_all() used by checkpoint. A new function _ma_bitmap_flushable() used by block format to signal when it starts modifying a bitmap and when it is done with it. storage/maria/ma_blockrec.c: When starting a row operation (insert/update/delete), mark that the bitmap is not flushable (because for example INSERT is going to over-allocate in the bitmap to prevent other threads from using our data pages). If a checkpoint comes at this moment it will wait for the bitmap to be flushable before flushing it. When the operation ends, bitmap becomes flushable again; that transition is done under the bitmap's mutex (needed for correct synchro with a concurrent checkpoint); but for INSERT/UPDATE this happens inside _ma_bitmap_release_unused() at a place where it already has the mutex, so the only penalty (mutex adding) is in DELETE and UNDO of INSERT. In case of errors after setting the bitmap unflushable, we must always set it back to flushable or checkpoint would block. Debug possibilities to force a sleep while the bitmap is over-allocated. 
In case of error in get_head_or_tail() in allocate_and_write_block_record(), we still need to unpin all pages. Bugfix: _ma_apply_redo_insert_row_blobs() produced wrong data_file_length. storage/maria/ma_blockrec.h: new bitmap calls. storage/maria/ma_checkpoint.c: filter_flush_indirect not needed anymore (flushing bitmap pages happens in _ma_bitmap_flush_all() now). So st_filter_param::is_data_file|pages_covered_by_bitmap not needed. Other filter_flush* don't need to flush bitmap anymore. Add debug possibility to flush all bitmap pages outside of a checkpoint, to simulate pagecache LRU eviction. When the background flush/checkpoint thread notices it has nothing to flush, it now sleeps directly until the next potential checkpoint moment instead of waking up every second. When in checkpoint we decide to not store a table in the checkpoint record (because it has logged no writes for example), we can also skip flushing this table. storage/maria/ma_commit.c: comment is out-of-date storage/maria/ma_key_recover.c: comment fix storage/maria/ma_loghandler.c: comment is out-of-date storage/maria/ma_open.c: comment is out-of-date storage/maria/ma_pagecache.c: comment for bug to fix. And we don't take checkpoints at end of REDO phase yet so can trust block->type. storage/maria/ma_recovery.c: Comments. Now-unneeded code for incomplete REDO-UNDO groups removed. When we forget about an old transaction we must really forget about it with bzero() (fixes the "long_trid!=0 assertion" recovery bug). When we delete a row with maria_delete() we turn on STATE_NOT_OPTIMIZED_ROWS so we do the same when we see a CLR_END for an UNDO_ROW_INSERT or when we execute an UNDO_ROW_INSERT (in both cases a row was deleted). Pick up max_long_trid from the checkpoint record. storage/maria/maria_chk.c: comment storage/maria/maria_def.h: MARIA_FILE_BITMAP gets new members: 'flushable', 'bitmap_cond' and 'pinned_pages'. 
storage/maria/trnman.c: I used to think that recovery only needs to know the maximum TrID of the lists of active and committed transactions. But no, sometimes both lists can even be empty and their TrID should not be reused. So Checkpoint now saves global_trid_generator in the checkpoint record. storage/maria/trnman_public.h: macros to read/store a TrID mysql-test/r/maria-recovery-bitmap.result: result is ok. Without the code fix, we would get a corruption message about the bitmap page in CHECK TABLE EXTENDED. mysql-test/t/maria-recovery-bitmap-master.opt: usual when we crash mysqld in tests mysql-test/t/maria-recovery-bitmap.test: test of recovery problems specific of the bitmap pages.
2007-12-14 16:14:12 +01:00
/*
Any incomplete group should be of an old crash which already had a
recovery and thus has logged INCOMPLETE_GROUP which we must have seen.
*/
DBUG_ASSERT(all_active_trans[sid].group_start_lsn == LSN_IMPOSSIBLE);
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
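The three-way group rule described in the ma_loghandler.h note above (a record may not end a group, may end a group, or may be a group in itself and abort any unfinished group) can be sketched as follows. This is only an illustration, not the code in ma_recovery.c: the enum values, struct rec and count_applied() are hypothetical names introduced here.

```c
#include <stddef.h>

/* Hypothetical stand-in for the three-way record_ends_group value. */
enum group_role
{
  REC_NOT_GROUP_END,    /* part of a group, does not close it (e.g. a REDO) */
  REC_ENDS_GROUP,       /* closes the group opened before it (e.g. an UNDO) */
  REC_IS_GROUP_ITSELF   /* stands alone (e.g. COMMIT) */
};

struct rec
{
  enum group_role role;
};

/*
  Walk the log and count how many records would be executed.
  Records are held back until a group-ending record is seen; a record that
  is a group in itself discards any held-back, unfinished group.
*/
static size_t count_applied(const struct rec *log, size_t n)
{
  size_t pending= 0, applied= 0, i;
  for (i= 0; i < n; i++)
  {
    switch (log[i].role)
    {
    case REC_NOT_GROUP_END:
      pending++;                /* wait for the end of the group */
      break;
    case REC_ENDS_GROUP:
      applied+= pending + 1;    /* complete group: execute all of it */
      pending= 0;
      break;
    case REC_IS_GROUP_ITSELF:
      pending= 0;               /* abort the unfinished group, if any */
      applied++;                /* then execute this record alone */
      break;
    }
  }
  return applied;               /* a trailing unfinished group is ignored */
}
```

With this rule, REDOs followed directly by a COMMIT (no UNDO in between, as in the write-failure scenario above) are never executed; only the COMMIT itself is counted.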
  if (long_trid != 0)
  {
    LSN ulsn= all_active_trans[sid].undo_lsn;
WL#3072 Maria recovery
Misc changes:
- fix for benign Valgrind error, compiler warnings
- fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints
- fix for too paranoid assertion
- adding ability to take checkpoints at the end of the REDO phase and at the end of recovery
- other minor changes
storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished is moved to maria_recover().
storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized
storage/maria/ma_checkpoint.c:
* reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc().
* fix for compiler warnings.
storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables
storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count()->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE).
storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR
storage/maria/ma_recovery.c:
* new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not.
* moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details").
* Refining an error detection for something which could happen if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any).
storage/maria/ma_recovery.h: new parameter
storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
    /*
      If the first record of that transaction is after 'rec', it's probably
      because that transaction was found in the checkpoint record, and then
      it's ok, we can forget about that transaction (we'll meet it later
      again in the REDO phase) and replace it with the one in 'rec'.
    */
    if ((ulsn != LSN_IMPOSSIBLE) &&
        (cmp_translog_addr(ulsn, rec->lsn) < 0))
    {
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)
mysys/my_realloc.c: noted an issue in my_realloc()
sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc: update to new prototype
storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but misses a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page
storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h: new prototype
storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h: new function
storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
* enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
* test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
* comments.
storage/maria/ma_recovery.c:
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* reworking all around (less duplication)
storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c: tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
* update to new prototype
* no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
* a function for Recovery to create a transaction (TRN), needed in the UNDO phase
* a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
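The crash-safe creation of the first bitmap page described in the ma_bitmap.c note above (size the file to 8190 bytes, then write the 2-byte end marker) can be sketched with plain POSIX calls. This is a sketch under assumptions, not Maria's code: create_first_bitmap_page(), first_bitmap_page_is_complete(), the two macros and the marker bytes passed in are hypothetical, and the real code goes through its own file wrappers rather than calling ftruncate()/pwrite() directly.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#define BITMAP_PAGE_SIZE 8192   /* assumed page size, as in the note above */
#define MARKER_SIZE 2           /* size of the end marker */

/*
  Create the first bitmap page in two steps.  If we crash between them the
  file is only 8190 bytes long, so a later reader sees a short page (which
  can be recreated) instead of a full page with a missing marker.
  Returns 0 on success, -1 on failure.
*/
static int create_first_bitmap_page(int fd,
                                    const unsigned char marker[MARKER_SIZE])
{
  /* Step 1: size the file to the page size minus the marker. */
  if (ftruncate(fd, BITMAP_PAGE_SIZE - MARKER_SIZE) != 0)
    return -1;
  /* Step 2: append the marker; only now does the page reach full size. */
  if (pwrite(fd, marker, MARKER_SIZE,
             BITMAP_PAGE_SIZE - MARKER_SIZE) != MARKER_SIZE)
    return -1;
  return 0;
}

/* Reader-side check: 1 if a full first page exists, 0 if short, -1 on error. */
static int first_bitmap_page_is_complete(int fd)
{
  struct stat st;
  if (fstat(fd, &st) != 0)
    return -1;
  return st.st_size >= BITMAP_PAGE_SIZE ? 1 : 0;
}
```

The point of the ordering is that the file length itself says whether the marker write happened, so no separate "creation finished" flag is needed.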
      char llbuf[22];
      llstr(long_trid, llbuf);
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread)
Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break)
Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel)
REDO_REPAIR now also logs used key map
Fixed some bugs in REDO logging of key pages
Better error messages from maria_read_log
Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code.
Don't allow pagecaches with less than 8 blocks (Causes strange crashes)
Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise)
Fixed bug in ma_pagecache unit tests that caused program to sometimes fail
Added some missing calls to MY_INIT() that caused some unit tests to fail
Fixed that TRUNCATE works properly on temporary MyISAM files
Updated some result files to new table checksum results (checksum when NULL fields are ignored)
perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
      eprint(tracef, "Found an old transaction long_trid %s short_trid %u"
             " with same short id as this new transaction, and has neither"
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
             " committed nor rolled back (undo_lsn: (%lu,0x%lx))",
             llbuf, sid, LSN_IN_PARTS(ulsn));
      goto err;
    }
  }
  /* The record header starts with the 6-byte long transaction id */
  long_trid= uint6korr(rec->header);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
  /* Start tracking it under short id 'sid'; no LSNs are known for it yet */
  new_transaction(sid, long_trid, LSN_IMPOSSIBLE, LSN_IMPOSSIBLE);
  goto end;
err:
  ALERT_USER();
  return 1;
end:
  return 0;
}
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
static void new_transaction(uint16 sid, TrID long_id, LSN undo_lsn,
                            LSN first_undo_lsn)
{
  char llbuf[22];
  all_active_trans[sid].long_trid= long_id;
  llstr(long_id, llbuf);
  tprint(tracef, "Transaction long_trid %s short_trid %u starts,"
         " undo_lsn (%lu,0x%lx) first_undo_lsn (%lu,0x%lx)\n",
         llbuf, sid, LSN_IN_PARTS(undo_lsn), LSN_IN_PARTS(first_undo_lsn));
  all_active_trans[sid].undo_lsn= undo_lsn;
  all_active_trans[sid].first_undo_lsn= first_undo_lsn;
  set_if_bigger(max_long_trid, long_id);
}
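As a rough illustration of the bookkeeping done by new_transaction(), here is a standalone sketch. Every name in it (toy_trans, toy_new_transaction, toy_max_long_trid, the toy_lsn and toy_trid typedefs) is hypothetical and only stands in for the real Aria types; the rule that a long id of 0 marks an unused slot is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the recovery types; not the real definitions. */
typedef uint64_t toy_lsn;
typedef uint64_t toy_trid;

struct toy_active_trans
{
  toy_trid long_trid;       /* assumption: 0 means "slot unused" */
  toy_lsn  undo_lsn;        /* most recent UNDO of the transaction */
  toy_lsn  first_undo_lsn;  /* oldest UNDO of the transaction */
};

/* One slot per possible 16-bit short transaction id. */
#define TOY_SHORT_TRID_COUNT 65536

static struct toy_active_trans toy_trans[TOY_SHORT_TRID_COUNT];
static toy_trid toy_max_long_trid= 0;

/* Registers a transaction under its short id and tracks the largest long id. */
static void toy_new_transaction(uint16_t sid, toy_trid long_id,
                                toy_lsn undo_lsn, toy_lsn first_undo_lsn)
{
  assert(long_id != 0);  /* 0 is reserved for unused slots in this sketch */
  toy_trans[sid].long_trid= long_id;
  toy_trans[sid].undo_lsn= undo_lsn;
  toy_trans[sid].first_undo_lsn= first_undo_lsn;
  if (long_id > toy_max_long_trid)
    toy_max_long_trid= long_id;
}
```

Indexing the array directly by the short id keeps the lookup constant-time while the log is scanned; the running maximum of long ids is the kind of value a transaction manager could use to pick fresh ids once recovery is over.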
prototype_redo_exec_hook_dummy(CHECKPOINT)
{
  /* the only checkpoint we care about was found via control file, ignore */
  return 0;
}
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
prototype_redo_exec_hook_dummy(INCOMPLETE_GROUP)
{
  /* the incomplete group was already aborted */
  return 0;
}
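The boundary role of LOGREC_INCOMPLETE_GROUP, as described in the commit note above, can be pictured with a toy log scanner. This is only a sketch under simplifying assumptions: a single transaction, a handful of hypothetical record kinds (toy_rec_kind), and a counter in place of real REDO execution. It is not the actual recovery loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record kinds; the real log has many more record types. */
enum toy_rec_kind
{
  TOY_REDO,             /* belongs to a group, does not end it */
  TOY_UNDO,             /* ends the current group */
  TOY_CLR_END,          /* ends the current group */
  TOY_COMMIT,           /* a group in itself: aborts any pending group */
  TOY_INCOMPLETE_GROUP  /* boundary: pending REDOs must be skipped */
};

/*
  Counts the REDOs that would be executed: a REDO is applied only if an
  UNDO or CLR_END closes its group before another boundary is met.
  REDOs still pending at the end of the log are dropped.
*/
static size_t toy_count_applied_redos(const enum toy_rec_kind *log, size_t n)
{
  size_t applied= 0, pending= 0, i;
  assert(log != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    switch (log[i])
    {
    case TOY_REDO:
      pending++;
      break;
    case TOY_UNDO:
    case TOY_CLR_END:
      applied+= pending;       /* the group is complete, execute it */
      pending= 0;
      break;
    case TOY_COMMIT:
    case TOY_INCOMPLETE_GROUP:
      pending= 0;              /* abort the unfinished group */
      break;
    }
  }
  return applied;
}
```

With the sequence REDO, UNDO, REDO*, REDO, CLR_END the scanner applies three REDOs, wrongly pairing REDO* with the later CLR_END. With an INCOMPLETE_GROUP record written right after REDO*, only two are applied and REDO* stays skipped on every later run.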
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionility, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
prototype_redo_exec_hook(INCOMPLETE_LOG)
{
  MARIA_HA *info;
  if (skip_DDLs)
  {
    tprint(tracef, "we skip DDLs\n");
    return 0;
  }
  if ((info= get_MARIA_HA_from_REDO_record(rec)) == NULL)
  {
    /* no such table, don't need to warn */
    return 0;
  }
  if (maria_is_crashed(info))
    return 0;
  if (info->s->state.is_of_horizon > rec->lsn)
  {
    /*
      This table was repaired at a time after this log entry.
      We can assume that all rows were inserted successfully, so we don't
      have to warn that the inserted data was not logged.
    */
    return 0;
  }
/*
Example of what can go wrong when replaying DDLs:
CREATE TABLE t (logged); INSERT INTO t VALUES(1) (logged);
ALTER TABLE t ... which does
CREATE a temporary table #sql... (logged)
INSERT data from t into #sql... (not logged)
RENAME #sql TO t (logged)
Removing tables by hand and replaying the log will, in the end, leave
an empty table "t": records are missing. If an INSERT into t was done
after the RENAME and that row had number 1 in its page, executing the
REDO_INSERT_ROW_HEAD on the recreated empty t will fail (assertion
failure in _ma_apply_redo_insert_row_head_or_tail(): new data page is
created whereas rownr is not 0).
So when the server disables logging for ALTER TABLE or CREATE SELECT, it
logs LOGREC_INCOMPLETE_LOG to warn maria_read_log and then the user.
Another issue is that replaying of DDLs is not correct enough to work if
there was a crash during a DDL (see comment in execution of
REDO_RENAME_TABLE).
*/
eprint(tracef, "***WARNING: Aria engine currently logs no records "
"about insertion of data by ALTER TABLE and CREATE SELECT, "
"as they are not necessary for recovery; "
"present applying of log records to table '%s' may well not work."
"***", info->s->index_file_name.str);
/* Prevent using the table for anything other than undo repair */
_ma_mark_file_crashed(info->s);
UNDO of rows now puts back all parts of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for not transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed if REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that were defined as BOOLEAN in dbug.c code Added DEBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as test requires DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as test requires DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as test requires DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn off DBUG with --debug-on=0) Indentation fixes 
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for not transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplifications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all parts of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directory entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minimum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position Ensure allocation of tail blocks is of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed some bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there are no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused page cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easily mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings. 
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we were in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always include row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that test recovery of update with blobs Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_maria_row->min_length' for storing min length needed on head page Added 'st_maria_handler->blob_buff & blob_buff_size' for 
storing blobs Moved all tuning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
recovery_warnings++;
return 0;
}
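The warning-counting scheme described in the commit notes above (a hook that reports an incomplete log and bumps a counter, and a caller that uses the count to temper its final "SUCCESS" message) can be sketched in isolation. This is only an illustrative sketch, not the engine's code: every `sketch_`/`SK_` name, the record-type enum and the message strings are hypothetical, and only the general idea (count warnings instead of failing) is taken from the source.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical record types; the real log has many more. */
enum sketch_rec_type { SK_REDO_INSERT, SK_INCOMPLETE_LOG, SK_COMMIT };

/* Plays the role of the recovery_warnings counter (assumption). */
static unsigned sketch_warnings;

/* Hook for the "incomplete log" record: warn and count, but do not fail. */
static int sketch_exec_incomplete_log(const char *table)
{
  fprintf(stderr, "***WARNING: log for table '%s' is incomplete***\n", table);
  sketch_warnings++;
  return 0;
}

/*
  Walks the records in order. Returns 0 on success, 1 on error; on
  success, the number of warnings seen is stored in *warnings_count.
*/
static int sketch_apply_log(const enum sketch_rec_type *recs, size_t n,
                            const char *table, unsigned *warnings_count)
{
  size_t i;
  sketch_warnings= 0;
  for (i= 0; i < n; i++)
  {
    if (recs[i] == SK_INCOMPLETE_LOG && sketch_exec_incomplete_log(table))
      return 1;
    /* Other record types would be applied here. */
  }
  *warnings_count= sketch_warnings;
  return 0;
}

/* Chooses the final message the way a log-reading tool might. */
static const char *sketch_final_message(int error, unsigned warnings)
{
  if (error)
    return "FAILED";
  return warnings ? "SUCCESS (with warnings)" : "SUCCESS";
}
```

Returning 0 from the hook while incrementing the counter keeps log applying going. The caller alone decides how loudly to report the problem.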
prototype_redo_exec_hook(REDO_CREATE_TABLE)
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
{
File dfile= -1, kfile= -1;
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
char *linkname_ptr, filename[FN_REFLEN], *name, *ptr, *ptr2,
*data_file_name, *index_file_name;
uchar *kfile_header;
myf create_flag;
uint flags;
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
int error= 1, create_mode= O_RDWR | O_TRUNC, i;
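The skip rules described in the 2008-01-17 commit note (a REDO or UNDO seen in the REDO phase is ignored when its LSN is below the table's skip_redo_lsn; a missing or corrupted table makes recovery move on to the next record instead of failing, with a warning if that happens in the UNDO phase) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in ma_recovery.c: `lsn_t`, `table_status`, `recovery_phase`, `replay_action` and `decide_replay()` are hypothetical names, and an LSN is modelled as a plain 64-bit integer where a larger value means a later log position.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN: larger value = later in the log. */
typedef uint64_t lsn_t;

enum table_status   { TABLE_OK, TABLE_MISSING, TABLE_CORRUPTED };
enum recovery_phase { PHASE_REDO, PHASE_UNDO };
enum replay_action  { REPLAY_APPLY, REPLAY_SKIP, REPLAY_SKIP_WITH_WARNING };

/* Hypothetical helper: decide what to do with one log record. */
static enum replay_action
decide_replay(enum table_status status, enum recovery_phase phase,
              lsn_t record_lsn, lsn_t skip_redo_lsn)
{
  assert(phase == PHASE_REDO || phase == PHASE_UNDO);

  /* Unusable table: never fail, just go to the next record.
     Skipping an UNDO in the UNDO phase is worth a warning. */
  if (status != TABLE_OK)
    return (phase == PHASE_UNDO) ? REPLAY_SKIP_WITH_WARNING : REPLAY_SKIP;

  /* Records older than skip_redo_lsn are ignored in the REDO phase. */
  if (phase == PHASE_REDO && record_lsn < skip_redo_lsn)
    return REPLAY_SKIP;

  return REPLAY_APPLY;
}
```

Keeping the decision in one pure function, as assumed here, makes the three outcomes easy to check in isolation from file and page-cache access.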
MARIA_HA *info= NULL;
Remove SAFE_MODE for opt_range as it disables UPDATE to use keys REDO optimization (Basically avoid moving blocks from/to pagecache) More command line arguments to maria_read_log Fixed recovery bug when recreating table sql/opt_range.cc: Remove SAFE_MODE for opt_range as it disables UPDATE to use keys storage/maria/ma_blockrec.c: REDO optimization Use new interface for pagecache_reads to avoid copying page buffers storage/maria/ma_loghandler.c: Patch from Sanja: - Added new parameter to translog_get_page to use direct links to pagecache - Changed scanner to be able to use direct links This avoids a lot of calls to bmove512() in page cache. storage/maria/ma_loghandler.h: Added direct link to pagecache objects storage/maria/ma_open.c: Added const to parameter Added missing braces storage/maria/ma_pagecache.c: From Sanja: - Added direct links to pagecache (from pagecache_read()) Direct link means that on pagecache_read we get back a pointer to the pagecache buffer From Monty: - Fixed arguments to init_page_cache to handle big page caches - Fixed compiler warnings - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors storage/maria/ma_pagecache.h: Changed block numbers from int to long to be able to handle big page caches Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK storage/maria/ma_recovery.c: Fixed recovery bug when recreating table (table was kept open) Moved some variables to function start (portability) Added space to some print messages storage/maria/maria_chk.c: key_buffer_size -> page_buffer_size storage/maria/maria_def.h: Changed default page_buffer_size to 10M storage/maria/maria_read_log.c: Added more startup options: --version --undo (apply undo) --page_cache_size (to run with big cache sizes) --silent (to not get any output from --apply) storage/maria/unittest/ma_control_file-t.c: Fixed compiler warning storage/maria/unittest/ma_test_loghandler-t.c: Added new argument to translog_init_scanner() 
storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Added new argument to translog_init_scanner() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
uint kfile_size_before_extension, keystart;
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
if (skip_DDLs)
{
  tprint(tracef, "we skip DDLs\n");
  return 0;
}
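The DDL handling described in the 2007-09-15 commit note (recovery from ha_maria skips DDL records; maria_read_log replays them; for REDO_RENAME, if the 'new_name' table exists and is more recent than the record, the 'old_name' table is removed) can be sketched as a small decision function. This is a hedged illustration only: `lsn_t`, `rename_action` and `decide_rename()` are hypothetical names, LSNs are modelled as 64-bit integers, and treating "more recent" as "greater than or equal to the record's LSN" is an assumption, not something the commit note states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN: larger value = later in the log. */
typedef uint64_t lsn_t;

enum rename_action
{
  RENAME_SKIPPED_DDL,   /* caller asked not to replay DDLs at all */
  RENAME_DROP_OLD_ONLY, /* 'new_name' is already up to date: remove 'old_name' */
  RENAME_REPLAY         /* perform the rename described by the record */
};

/* Hypothetical helper: what to do with one REDO_RENAME record. */
static enum rename_action
decide_rename(bool skip_ddls, bool new_exists,
              lsn_t new_create_rename_lsn, lsn_t record_lsn)
{
  if (skip_ddls)
    return RENAME_SKIPPED_DDL;

  /* Assumption: ">=" means the existing 'new_name' table already
     reflects this record or something later. */
  if (new_exists && new_create_rename_lsn >= record_lsn)
    return RENAME_DROP_OLD_ONLY;

  return RENAME_REPLAY;
}
```

Under these assumptions, a caller modelling recovery from ha_maria would pass `skip_ddls = true`, while one modelling maria_read_log would pass `false` so that tables can be recreated from scratch.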
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
    translog_read_record(rec->lsn, 0, rec->record_length,
                         log_record_buffer.str, NULL) !=
    rec->record_length)
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocating tail pages - Ensure we reserve place for directory entry when calculating place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clears freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these were written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdb extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to read record");
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves into the bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually roll back). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
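The commit message above distinguishes three kinds of log record: one that does not end a group, one that ends a group, and one that is a group in itself and aborts any unfinished group before it (COMMIT is the example given). The following is a minimal sketch of that rule only. The enum, struct and function names are hypothetical and are not taken from ma_loghandler.h or ma_recovery.c; the real code also executes the records, which this sketch replaces with counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the real enum behind LOG_DESC::record_ends_group may differ. */
enum rec_group_kind
{
  REC_NOT_GROUP_END,      /* e.g. a REDO: queued until its group is closed */
  REC_ENDS_GROUP,         /* e.g. an UNDO: closes the group of queued REDOs */
  REC_IS_GROUP_ITSELF     /* e.g. COMMIT: aborts any unfinished group */
};

struct log_rec { const char *name; enum rec_group_kind kind; };

struct replay_stats
{
  int applied;   /* queued records whose group was properly closed */
  int aborted;   /* queued records dropped by a "group in itself" record */
  int pending;   /* queued records still waiting when the log ended */
};

/* Walks the records in log order and classifies the queued ones. */
static void replay_groups(const struct log_rec *log, size_t n,
                          struct replay_stats *out)
{
  size_t i;
  assert(log != NULL && out != NULL);
  out->applied= out->aborted= out->pending= 0;
  for (i= 0; i < n; i++)
  {
    switch (log[i].kind) {
    case REC_NOT_GROUP_END:
      out->pending++;                 /* wait for the end of the group */
      break;
    case REC_ENDS_GROUP:
      out->applied+= out->pending;    /* group complete: safe to execute */
      out->pending= 0;
      break;
    case REC_IS_GROUP_ITSELF:
      out->aborted+= out->pending;    /* unfinished group is discarded */
      out->pending= 0;
      break;
    }
  }
}
```

With a log of REDO, COMMIT, REDO, UNDO, the first REDO is counted as aborted and the second as applied, which matches the write-failure scenario the message describes.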
goto end;
}
name= (char *)log_record_buffer.str;
WL#3072 Maria recovery

Misc changes:
- fix for benign Valgrind error, compiler warnings
- fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints
- fix for too paranoid assertion
- adding ability to take checkpoints at the end of the REDO phase and at the end of recovery.
- other minor changes

storage/maria/ha_maria.cc:
  The checkpoint done after Recovery is finished is moved to maria_recover().
storage/maria/ma_bitmap.c:
  fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized
storage/maria/ma_checkpoint.c:
  * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc().
  * fix for compiler warnings.
storage/maria/ma_delete_all.c:
  info->trn is NULL for non-transactional tables
storage/maria/ma_locking.c:
  correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE).
storage/maria/ma_loghandler.c:
  fail early if UNRECOVERABLE_ERROR
storage/maria/ma_recovery.c:
  * new argument to maria_apply_log(): whether it should take checkpoints (at end of REDO phase and at the very end) or not.
  * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details").
  * Refining an error detection for something which could happen if there is a checkpoint record in the log.
  * Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any).
storage/maria/ma_recovery.h:
  new parameter
storage/maria/maria_read_log.c:
  update to new prototype
2007-10-08 19:08:25 +02:00
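The commit message above says close_one_table() now limits itself to a single scan of all_tables[], so tables not registered in that array are left alone. The sketch below illustrates that single-pass idea under stated assumptions: `struct table_slot` and `close_by_name()` are hypothetical stand-ins, not the real all_tables[] element type or the real close_one_table() signature, and "closing" is reduced to clearing a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one registered slot of all_tables[]. */
struct table_slot
{
  const char *name;   /* table name as found in the log record */
  int is_open;        /* non-zero while an instance is open */
};

/*
  Closes every registered open instance of 'name' in one pass over the
  array. Tables not present in the array are never touched.
  Returns the number of instances closed.
*/
static int close_by_name(struct table_slot *tables, size_t n,
                         const char *name)
{
  int closed= 0;
  size_t i;
  assert(tables != NULL && name != NULL);
  for (i= 0; i < n; i++)
  {
    if (tables[i].is_open && strcmp(tables[i].name, name) == 0)
    {
      tables[i].is_open= 0;   /* the real code would close the handle here */
      closed++;
    }
  }
  return closed;
}
```

A caller replaying a REDO_CREATE_TABLE or REDO_DROP_TABLE record would run this before touching the files, so that no stale open instance survives the DDL.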
/*
TRUNCATE TABLE and REPAIR USE_FRM call maria_create(), so below we can
find a REDO_CREATE_TABLE for a table which we have open, that's why we
need to look for any open instances and close them first.
*/
Remove SAFE_MODE for opt_range as it prevents UPDATE from using keys
REDO optimization (Basically avoid moving blocks from/to pagecache)
More command line arguments to maria_read_log
Fixed recovery bug when recreating table

sql/opt_range.cc:
  Remove SAFE_MODE for opt_range as it prevents UPDATE from using keys
storage/maria/ma_blockrec.c:
  REDO optimization
  Use new interface for pagecache_reads to avoid copying page buffers
storage/maria/ma_loghandler.c:
  Patch from Sanja:
  - Added new parameter to translog_get_page to use direct links to pagecache
  - Changed scanner to be able to use direct links
  This avoids a lot of calls to bmove512() in page cache.
storage/maria/ma_loghandler.h:
  Added direct link to pagecache objects
storage/maria/ma_open.c:
  Added const to parameter
  Added missing braces
storage/maria/ma_pagecache.c:
  From Sanja:
  - Added direct links to pagecache (from pagecache_read())
    Direct link means that on pagecache_read we get back a pointer to the pagecache buffer
  From Monty:
  - Fixed arguments to init_page_cache to handle big page caches
  - Fixed compiler warnings
  - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors
storage/maria/ma_pagecache.h:
  Changed block numbers from int to long to be able to handle big page caches
  Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK
storage/maria/ma_recovery.c:
  Fixed recovery bug when recreating table (table was kept open)
  Moved some variables to function start (portability)
  Added space to some print messages
storage/maria/maria_chk.c:
  key_buffer_size -> page_buffer_size
storage/maria/maria_def.h:
  Changed default page_buffer_size to 10M
storage/maria/maria_read_log.c:
  Added more startup options:
  --version
  --undo (apply undo)
  --page_cache_size (to run with big cache sizes)
  --silent (to not get any output from --apply)
storage/maria/unittest/ma_control_file-t.c:
  Fixed compiler warning
storage/maria/unittest/ma_test_loghandler-t.c:
  Added new argument to translog_init_scanner()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Added new argument to translog_init_scanner()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
if (close_one_table(name, rec->lsn))
{
eprint(tracef, "Table '%s' got error %d on close", name, my_errno);
ALERT_USER();
goto end;
}
  /* we try hard to get create_rename_lsn, to avoid mistakes if possible */
  info= maria_open(name, O_RDONLY, HA_OPEN_FOR_REPAIR);
  if (info)
  {
    MARIA_SHARE *share= info->s;
    /* check that we're not already using it */
    if (share->reopen != 1)
    {
      eprint(tracef, "Table '%s' is already open (reopen=%u)",
             name, share->reopen);
      ALERT_USER();
      goto end;
    }
    DBUG_ASSERT(share->now_transactional == share->base.born_transactional);
    if (!share->base.born_transactional)
    {
      /*
        It could be that the transactional table was later dropped and a
        non-transactional one was renamed to its name; create_rename_lsn is
        then 0 and should not be trusted.
      */
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
tprint(tracef, "Table '%s' is not transactional, ignoring creation\n",
name);
ALERT_USER();
error= 0;
goto end;
}
if (cmp_translog_addr(share->state.create_rename_lsn, rec->lsn) >= 0)
{
tprint(tracef, "Table '%s' has create_rename_lsn (%lu,0x%lx) more "
"recent than record, ignoring creation",
name, LSN_IN_PARTS(share->state.create_rename_lsn));
error= 0;
goto end;
}
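The check above skips a creation record when the table's stored `create_rename_lsn` is at or past the record's LSN, which keeps replay idempotent. Below is a minimal standalone sketch of that comparison. It assumes a simplified log address packed into 64 bits (file number in the high half, offset in the low half); `sketch_lsn`, `sketch_make_lsn`, `sketch_cmp_lsn` and `sketch_should_skip` are hypothetical names for illustration, not the Maria API.

```c
#include <stdint.h>

/* Hypothetical log address: file number in the high 32 bits,
   byte offset in the low 32 bits (an assumption for this sketch). */
typedef uint64_t sketch_lsn;

/* Build an address from its two parts. */
static sketch_lsn sketch_make_lsn(uint32_t file_no, uint32_t offset)
{
  return ((sketch_lsn) file_no << 32) | (sketch_lsn) offset;
}

/* Three-way compare: negative, zero or positive, like memcmp(). */
static int sketch_cmp_lsn(sketch_lsn a, sketch_lsn b)
{
  return (a > b) - (a < b);
}

/* Returns 1 if the table already reflects the record (the table's
   stamp is at or after the record's address), so replay must skip it. */
static int sketch_should_skip(sketch_lsn table_lsn, sketch_lsn record_lsn)
{
  return sketch_cmp_lsn(table_lsn, record_lsn) >= 0;
}
```

Because the file number occupies the high bits, a record in a later log file always compares greater than any record in an earlier file, whatever the offsets are. The `>=` matters: a record whose address equals the table's stamp has already been applied and is skipped as well.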
if (maria_is_crashed(info))
{
Bugs fixed:
- If not in autocommit mode, delete rows one by one so that we can roll back if necessary
- bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
- Fixed bug in bitmap handling when allocating tail pages
- Ensure we reserve place for the directory entry when calculating place for head and tail pages
- Fixed wrong value in bitmap->size[0]
- Fixed wrong assert in flush_log_for_bitmap
- Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
- Mark new pages as changed (required to get repair() to work)
- Fixed problem with advancing log horizon pointer within one page bounds
- Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
- Fixed bug in logging of rows with more than one big blob
- Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls
- Flush pagecache when we change from logging to not logging (if not, pagecache code breaks)
- Ensure my_errno is set on return from write/delete/update
- Fixed bug when using FIELD_SKIP_PRESPACE
New features:
- mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution
- Fix that optimize works for Maria tables
- maria_check --zerofill now also clears freed blob pages
- maria_check -di now prints more information about record page utilization
Optimizations:
- Use pagecache_unlock_by_link() instead of pagecache_write() if possible.
(Avoids a memory copy and a find_block)
- Simplify code to abort when we found optimal bit pattern
- Skip also full head page bit patterns when searching for tail
- Increase default repair buffer to 128M for maria_chk and maria_read_log
- Increase default sort buffer for maria_chk to 64M
- Increase size of sortbuffer and pagecache for mysqld to 64M
- VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables
Better reporting:
- Fixed test of error condition for flush (for better error code)
- More error messages to mysqld if Maria recovery fails
- Always print warning if rows are deleted in repair
- Added global function _db_force_flush() that is usable when doing debugging in gdb
- Added call to my_debug_put_break_here() in case of some errors (for debugging)
- Remove used testfiles in unittest as these were written in different directories depending on where the test was started from
This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria
dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdb
extra/replace.c: Fixed memory leak
include/my_dbug.h: Prototype for _db_force_flush()
include/my_global.h: Added stdarg.h as my_sys.h now depends on it.
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deletion of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables) storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automatically from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Table '%s' is crashed, can't recreate it", name);
2007-07-26 11:56:21 +02:00
ALERT_USER();
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
goto end;
2007-07-26 11:56:21 +02:00
}
maria_close(info);
info= NULL;
}
Fix for LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery More DBUG_PRINT (to simplify future debugging) Aria: Added STATE_IN_REPAIR, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Aria: Some trivial speedup optimization Aria: Better warning if table was marked crashed by unfinished repair mysql-test/lib/v1/mysql-test-run.pl: Fix so one can run RQG mysql-test/suite/maria/r/maria-recovery2.result: Update for new error message. mysys/stacktrace.c: Fixed compiler warning storage/maria/ha_maria.cc: More DBUG_PRINT Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Don't log query for dropping temporary table. storage/maria/ha_maria.h: Added prototype for drop_table() storage/maria/ma_blockrec.c: More DBUG_PRINT Make read_long_data() inline for most cases. (Trivial speedup optimization) storage/maria/ma_check.c: Better warning if table was marked crashed by unfinished repair storage/maria/ma_open.c: More DBUG_PRINT storage/maria/ma_recovery.c: Give warning if found crashed table. Changed warning for tables that can't be opened. storage/maria/ma_recovery_util.c: Write warnings to DBUG file storage/maria/maria_chk.c: Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. storage/maria/maria_def.h: Added maria_mark_in_repair(x) storage/maria/maria_read_log.c: Added option: --character-sets-dir storage/maria/trnman.c: By default set min_read_from to max value. This allows us to remove TRN:s from rows during recovery to get more space. This fixes bug LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
2010-07-30 09:45:27 +02:00
else
{
/* one or two files absent, or header corrupted... */
tprint(tracef, "Table '%s' can't be opened (Error: %d)\n",
name, my_errno);
}
2007-09-15 14:45:26 +02:00
/* if does not exist, or is older, overwrite it */
2007-07-26 11:56:21 +02:00
ptr= name + strlen(name) + 1;
if ((flags= ptr[0] ? HA_DONT_TOUCH_DATA : 0))
tprint(tracef, ", we will only touch index file");
ptr++;
kfile_size_before_extension= uint2korr(ptr);
ptr+= 2;
keystart= uint2korr(ptr);
ptr+= 2;
kfile_header= (uchar *)ptr;
ptr+= kfile_size_before_extension;
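The pointer arithmetic above walks a record buffer laid out as: a NUL-terminated table name, one flag byte (non-zero means only the index file is touched), two 2-byte integers, and then `kfile_size_before_extension` bytes of index-file header. A minimal standalone sketch of the same walk, with bounds checks added, might look like the following. The names `read_le16`, `create_rec_view` and `parse_create_rec` are hypothetical and not part of ma_recovery.c; `read_le16` stands in for `uint2korr()`, which reads a 2-byte little-endian value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for uint2korr(): read a 2-byte little-endian value. */
static uint16_t read_le16(const unsigned char *p)
{
  return (uint16_t) (p[0] | ((uint16_t) p[1] << 8));
}

/* Hypothetical view of the fields that the code above extracts. */
struct create_rec_view
{
  const char *name;                      /* NUL-terminated table name */
  int only_touch_index;                  /* non-zero flag byte in the record */
  uint16_t kfile_size_before_extension;  /* length of the header that follows */
  uint16_t keystart;
  const unsigned char *kfile_header;     /* points into the record buffer */
  const unsigned char *rest;             /* first byte after the header */
};

/* Returns 0 on success, -1 if the buffer is too short for the layout. */
static int parse_create_rec(const unsigned char *buf, size_t len,
                            struct create_rec_view *out)
{
  const unsigned char *end= buf + len;
  const unsigned char *nul= memchr(buf, 0, len);
  const unsigned char *ptr;

  if (nul == NULL)
    return -1;                           /* name is not terminated */
  ptr= nul + 1;                          /* same as name + strlen(name) + 1 */
  if ((size_t) (end - ptr) < 5)
    return -1;                           /* flag byte + two 2-byte fields */

  out->name= (const char *) buf;
  out->only_touch_index= (ptr[0] != 0);
  ptr++;
  out->kfile_size_before_extension= read_le16(ptr);
  ptr+= 2;
  out->keystart= read_le16(ptr);
  ptr+= 2;
  if ((size_t) (end - ptr) < out->kfile_size_before_extension)
    return -1;                           /* header is truncated */
  out->kfile_header= ptr;
  out->rest= ptr + out->kfile_size_before_extension;
  return 0;
}
```

The sketch returns pointers into the caller's buffer rather than copying, mirroring how the code above keeps `kfile_header` pointing into the log record.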
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
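/* set header LSNs: stamp the 3 consecutive LSN fields that start at
MARIA_FILE_CREATE_RENAME_LSN_OFFSET with this record's LSN */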
ptr2= (char *) kfile_header + sizeof(info->s->state.header) +
MARIA_FILE_CREATE_RENAME_LSN_OFFSET;
for (i= 0; i<3; i++)
{
lsn_store(ptr2, rec->lsn);
ptr2+= LSN_STORE_SIZE;
}
data_file_name= ptr;
ptr+= strlen(data_file_name) + 1;
index_file_name= ptr;
ptr+= strlen(index_file_name) + 1;
/** @todo handle symlinks */
if (data_file_name[0] || index_file_name[0])
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Table '%s' DATA|INDEX DIRECTORY clauses are not handled",
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
name);
goto end;
}
  /* Parenthesise the conditional: in C, '|' binds tighter than '?:' */
  fn_format(filename, name, "", MARIA_NAME_IEXT,
            (MY_UNPACK_FILENAME |
             ((flags & HA_DONT_TOUCH_DATA) ? MY_RETURN_REAL_PATH : 0)) |
            MY_APPEND_EXT);
  linkname_ptr= NULL;
  create_flag= MY_DELETE_OLD;
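Flag expressions such as the fn_format() argument above mix `|` with the conditional operator, and in C `|` binds more tightly than `?:`. The following is a minimal standalone sketch of the difference, not code from ma_recovery.c. The DEMO_* values and both function names are hypothetical stand-ins chosen for illustration; they are not the real constants from my_sys.h.

```c
#include <assert.h>

/* Hypothetical flag values, for illustration only. */
enum
{
  DEMO_UNPACK=      4,
  DEMO_REAL_PATH=  32,
  DEMO_APPEND_EXT= 256,
  DEMO_DONT_TOUCH=  1
};

/*
  No parentheses around the conditional: the compiler parses this as
  ((DEMO_UNPACK | (flags & DEMO_DONT_TOUCH)) ? DEMO_REAL_PATH : 0),
  so the condition is always non-zero and DEMO_UNPACK is never passed on.
  Some compilers warn about this form.
*/
int flags_unparenthesised(int flags)
{
  return (DEMO_UNPACK |
          (flags & DEMO_DONT_TOUCH) ? DEMO_REAL_PATH : 0) |
         DEMO_APPEND_EXT;
}

/* Conditional parenthesised: each flag is OR-ed in independently. */
int flags_parenthesised(int flags)
{
  return (DEMO_UNPACK |
          ((flags & DEMO_DONT_TOUCH) ? DEMO_REAL_PATH : 0)) |
         DEMO_APPEND_EXT;
}

/* Checks both variants against values computed by hand. */
void demo_check(void)
{
  /* The unparenthesised form ignores 'flags' entirely. */
  assert(flags_unparenthesised(0) == (DEMO_REAL_PATH | DEMO_APPEND_EXT));
  assert(flags_unparenthesised(DEMO_DONT_TOUCH) ==
         (DEMO_REAL_PATH | DEMO_APPEND_EXT));
  /* The parenthesised form adds DEMO_REAL_PATH only when asked to. */
  assert(flags_parenthesised(0) == (DEMO_UNPACK | DEMO_APPEND_EXT));
  assert(flags_parenthesised(DEMO_DONT_TOUCH) ==
         (DEMO_UNPACK | DEMO_REAL_PATH | DEMO_APPEND_EXT));
}
```

Calling demo_check() from a test main() exercises both variants. The sketch only shows how the two spellings differ; it makes no claim about which flags fn_format() needs in practice.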
UNDO of rows now puts back all parts of the row on their original pages and positions
Added variable _dbug_on_ to speed up execution when DBUG is not going to be used
Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0)
Fixed some bugs with 'non_flushable' marking of bitmap pages
Don't use 'non_flushable' marking of bitmap pages for non-transactional tables
SHOW CREATE TABLE now shows if table was created with page checksums
Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO
More tests (especially for blobs) and DBUG_ASSERTS()
More readable output from maria_read_log and maria_chk
Fixed wrong shift that caused Maria to crash on files > 4G
Mark tables as crashed if REDO fails

dbug/dbug.c:
  Changed to use my_bool (allowed me to remove some Windows-specific code)
  Added variable _dbug_on_ to speed up execution when DBUG is not going to be used
  Removed initialization of variables if not needed
include/my_dbug.h:
  Use my_bool for some functions that were defined as BOOLEAN in dbug.c code
  Added DEBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used
include/my_global.h:
  Define my_bool early
  Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago
mysql-test/mysql-test-run.pl:
  Added debug-on=0 to speed up tests
mysql-test/r/maria-recovery.result:
  Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid
mysql-test/r/maria.result:
  Added testing of page checksums
mysql-test/t/crash_commit_before-master.opt:
  Added --debug-on as test requires DBUG to work
mysql-test/t/maria-recovery-bitmap-master.opt:
  Added --debug-on as test requires DBUG to work
mysql-test/t/maria-recovery-master.opt:
  Added --debug-on as test requires DBUG to work
mysql-test/t/maria-recovery.test:
  Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid
mysql-test/t/maria.test:
  Added testing of page checksums
sql/mysqld.cc:
  Added --debug-on option (to be able to turn off DBUG with --debug-on=0)
  Indentation fixes
  Removed end spaces
sql/sql_show.cc:
  Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used
storage/maria/Makefile.am:
  Added ma_test_big.sh
storage/maria/ha_maria.cc:
  Store in create_info if page checksums are used (for SHOW CREATE TABLE)
storage/maria/ma_bitmap.c:
  Added _ma_bitmap_wait_or_flush() to cause readers of bitmap pages to wait with reading until bitmap is flushed.
  Use TAIL_PAGE_COUNT_MARKER for tail pages
  Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's
  Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT)
  Increase the calculated 'head_length' with the number of bytes used for extents.
  Update row->space_on_head_page also in _ma_bitmap_find_new_place()
  Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling)
  Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not.
  Don't use 'non_flushable' marking of bitmap pages for non-transactional tables.
  Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages.
  Added more DBUG_ASSERT() to find possible errors in other code
  Some code simplifications by adding new local variables
storage/maria/ma_blockrec.c:
  UNDO of rows now puts back all parts of the row on their original pages and positions.
  Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information
  This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free)
  Use PAGE_COUNT to mark if an extent is the start of a blob. (Needed for extent_to_bitmap_blocks())
  Added check_directory() for checking that directory entries are correct.
  Added checking of row checksums when reading rows (with EXTRA_DEBUG)
  Added make_space_for_directory() and extend_directory() for doing expansion of directory
  Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO
  Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry
  Added _ma_update_at_original_place() for UNDO of DELETES
  Added row->min_length to hold minimum required space needed on head page
  Changed find_free_position() to use make_space_for_directory()
  Changed make_empty_page() to allow optional creation of directory entry
  Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization)
  Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page
  Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position
  Ensure allocations of tail blocks are of at least MIN_TAIL_SIZE.
  Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache)
  Write original extent information in UNDO entry, not compacted ones (we need position to tails!)
  When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP)
  Fixed some bugs in directory handling
  Fixed bug where we wrote wrong lsn to blob pages
  Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs
  Ensure we call _ma_bitmap_flushable() also in case of errors
  When doing an update, first delete old entries, then search in bitmap for where to put new information
  Info->s -> share
  Rowid -> rowid
  More DBUG_ASSERT()
storage/maria/ma_blockrec.h:
  Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER
  Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits()
storage/maria/ma_check.c:
  Don't write extra empty line if there are no deleted blocks
  Ignore START_EXTENT_BIT's in page count
  Call _ma_fast_unlock_key_del() to free key_del link
storage/maria/ma_close.c:
  Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del())
storage/maria/ma_create.c:
  Changed constant to macro
storage/maria/ma_delete.c:
  For deleted keys, log also position to row
storage/maria/ma_extra.c:
  Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER
storage/maria/ma_key_recover.c:
  Added bzero() of LSN that confused page cache in case of uninitialized block
  Mark file crashed if applying of index changes fails
  Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link.
storage/maria/ma_locking.c:
  Added usage of MARIA_FILE_OPEN_COUNT_OFFSET
  Added _ma_mark_file_crashed()
storage/maria/ma_loghandler.c:
  Fixed bug where we logged uninitialized memory
storage/maria/ma_open.c:
  Moved state->changed to be at start of state info on disk to allow one to easily mark files as crashed
storage/maria/ma_page.c:
  Disable 'dummy' checksumming of pages as this gave false warnings. (Need to investigate if this is ever needed)
storage/maria/ma_pagecache.c:
  Fixed wrong shift that caused Maria to crash on files > 4G
storage/maria/ma_recovery.c:
  In case of errors, start writing on new line if we were in %## %## printing mode (Made errors more readable)
  Changed global variable name from warnings -> recovery_warnings
  Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant
  Removed special handling of row position for deleted keys. Keys now always include row positions
  _ma_apply_undo_row_delete() now gets page and row position
  Added check that we don't loop forever when handling undo's (in case of bug in undo chain)
  Print name of failed REDO/UNDO
storage/maria/ma_recovery.h:
  Removed old comment
storage/maria/ma_static.c:
  Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables)
storage/maria/ma_test2.c:
  Added option -u to specify number of rows to update
  Changed old option -u to be -A, as for ma_test1
  Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update)
  First created blob is now of max blob length to ensure we have at least one big blob in the table
storage/maria/ma_test_all.sh:
  More tests
storage/maria/ma_test_recovery.expected:
  Updated results
storage/maria/ma_test_recovery:
  Changed tests to use bigger blobs (not just 1K)
  Added new tests that test recovery of update with blobs
  Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO)
storage/maria/ma_update.c:
  Simplify code (changed * to if)
storage/maria/maria_chk.c:
  Make output more readable
storage/maria/maria_def.h:
  Changed 'changed' to int to prepare for more bits
  Added 2 more bytes to status information
  Added 'st_maria_row->min_length' for storing min length needed on head page
  Added 'st_maria_handler->blob_buff & blob_buff_size' for storing blobs
  Moved all tuning parameters into one block
  Added MARIA_SMALL_BLOB_BUFFER
  Added _ma_mark_file_crashed()
storage/myisam/mi_test2.c:
  Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update)
storage/maria/ma_test_big.sh:
  Testing of insert, update, delete, recovery and undo of rows with blobs
  Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
  tprint(tracef, "Table '%s' creating as '%s'\n", name, filename);
  if ((kfile= my_create_with_symlink(linkname_ptr, filename, 0, create_mode,
                                     MYF(MY_WME|create_flag))) < 0)
  {
Bugs fixed:
- If not in autocommit mode, delete rows one by one so that we can roll back if necessary
- bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
- Fixed bug in bitmap handling when allocating tail pages
- Ensure we reserve place for directory entry when calculating place for head and tail pages
- Fixed wrong value in bitmap->size[0]
- Fixed wrong assert in flush_log_for_bitmap
- Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
- Mark new pages as changed (Required to get repair() to work)
- Fixed problem with advancing log horizon pointer within one page bounds
- Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
- Fixed bug in logging of rows with more than one big blob
- Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls
- Flush pagecache when we change from logging to not logging (if not, pagecache code breaks)
- Ensure my_errno is set on return from write/delete/update
- Fixed bug when using FIELD_SKIP_PRESPACE

New features:
- mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution
- Fix that optimize works for Maria tables
- maria_check --zerofill now also clears freed blob pages
- maria_check -di now prints more information about record page utilization

Optimizations:
- Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
- Simplify code to abort when we found optimal bit pattern
- Skip also full head page bit patterns when searching for tail
- Increase default repair buffer to 128M for maria_chk and maria_read_log
- Increase default sort buffer for maria_chk to 64M
- Increase size of sortbuffer and pagecache for mysqld to 64M
- VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables

Better reporting:
- Fixed test of error condition for flush (for better error code)
- More error messages to mysqld if Maria recovery fails
- Always print warning if rows are deleted in repair
- Added global function _db_force_flush() that is usable when doing debugging in gdb
- Added call to my_debug_put_break_here() in case of some errors (for debugging)
- Remove used testfiles in unittest as these were written in different directories depending on from where the test was started

This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria

dbug/dbug.c:
  Added global function _db_force_flush() that is usable when doing debugging in gdb
extra/replace.c:
  Fixed memory leak
include/my_dbug.h:
  Prototype for _db_force_flush()
include/my_global.h:
  Added stdarg.h as my_sys.h now depends on it.
include/my_sys.h:
  Make my_debug_put_break_here() a NOP if not DBUG build
  Added my_printv_error()
include/myisamchk.h:
  Added entry 'lost' to be able to count space that is lost forever
mysql-test/r/maria.result:
  Updated results
mysql-test/t/maria.test:
  Reset autocommit after test
  New test to check if delete_all_rows is used (verified with --debug)
mysys/my_error.c:
  Added my_printv_error()
scripts/mysql_fix_privilege_tables.sh:
  First use binaries and scripts from source distribution, then in installed distribution
  (This ensures that a development branch doesn't pick up wrong scripts)
sql/mysqld.cc:
  Fix that one can break maria recovery with ^C when debugging
sql/sql_class.cc:
  Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed)
storage/maria/ha_maria.cc:
  Increase size of sortbuffer and pagecache to 64M
  Fix that optimize works for Maria tables
  Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
  If not in autocommit mode, delete rows one by one so that we can roll back if necessary
  Fixed variable comments
storage/maria/ma_bitmap.c:
  More ASSERTS to detect overwrite of bitmap pages
  bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
  Ensure we reserve place for directory entry when calculating place for head and tail pages
  bitmap->size[0] should not include space for directory entry
  Simplify code to abort when we found optimal bit pattern
  Skip also full head page bit patterns when searching for tail (should speed up some common cases)
  Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes
  Fixed wrong assert in flush_log_for_bitmap
  Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
storage/maria/ma_blockrec.c:
  Ensure my_errno is set on return
  Fixed non-optimal setting of row->min_length if we don't have variable length fields
  Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
  Added DBUG_ASSERT() if we read or write wrong VARCHAR data
  Added DBUG_ASSERT() to find out if row sizes are calculated wrong
  Fixed bug in logging of rows with more than one big blob
storage/maria/ma_check.c:
  Disable logging while normal repair is done to avoid logging of index changes
  Fixed bug that caused CHECKSUM part of key page to be used
  Fixed that delete of wrong records also works for BLOCK_RECORD
  Clear unallocated pages:
  - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not
  Better error reporting
  More information about record page utilization
  Change printing of file position to printing of pages to make output more readable
  Always print warning if rows are deleted
storage/maria/ma_create.c:
  Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future)
  Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order
  Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization)
  Store other fields in length order (to get better utilization of head block)
storage/maria/ma_delete.c:
  Ensure my_errno is set on return
storage/maria/ma_dynrec.c:
  Indentation fix
storage/maria/ma_locking.c:
  Set changed if open_count is counted down. (To avoid getting error "client is using or hasn't closed the table properly" with transactional tables)
storage/maria/ma_loghandler.c:
  Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja)
  Added more DBUG
  Indentation fixes
storage/maria/ma_open.c:
  Removed wrong casts
storage/maria/ma_page.c:
  Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new()
  Mark new pages as changed (Required to get repair() to work)
storage/maria/ma_pagecache.c:
  Fixed test of error condition for flush
  Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock()
  Added call to my_debug_put_break_here() in case of errors (for debugging)
storage/maria/ma_pagecrc.c:
  Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32
storage/maria/ma_recovery.c:
  Call my_printv_error() from eprint() to get critical errors to mysqld log
  Removed \n from error strings to eprint() to get nicer output in mysqld
  Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed
storage/maria/ma_update.c:
  Ensure my_errno is set on return
storage/maria/ma_write.c:
  Ensure my_errno is set on return
storage/maria/maria_chk.c:
  Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries)
  Added option --skip-safemalloc
  Don't write exponents for rec/key
storage/maria/maria_def.h:
  Increase default repair buffer to 128M for maria_chk and maria_read_log
  Increase default sort buffer for maria_chk to 64M
storage/maria/unittest/Makefile.am:
  Don't update files automatically from bitkeeper
storage/maria/unittest/ma_pagecache_consist.c:
  Remove testfile at end
storage/maria/unittest/ma_pagecache_single.c:
  Remove testfile at end
storage/maria/unittest/ma_test_all-t:
  More tests
  Safer checking if test caused error
2008-01-07 17:54:41 +01:00
    eprint(tracef, "Failed to create index file");
      goto end;
    }
    if (my_pwrite(kfile, kfile_header,
                  kfile_size_before_extension, 0, MYF(MY_NABP|MY_WME)) ||
        my_chsize(kfile, keystart, 0, MYF(MY_WME)))
    {
      eprint(tracef, "Failed to write to index file");
      goto end;
    }
    if (!(flags & HA_DONT_TOUCH_DATA))
    {
      fn_format(filename, name, "", MARIA_NAME_DEXT,
                MY_UNPACK_FILENAME | MY_APPEND_EXT);
      linkname_ptr= NULL;
      create_flag= MY_DELETE_OLD;
      if (((dfile=
            my_create_with_symlink(linkname_ptr, filename, 0, create_mode,
                                   MYF(MY_WME | create_flag))) < 0) ||
          my_close(dfile, MYF(MY_WME)))
      {
eprint(tracef, "Failed to create data file");
goto end;
}
/*
We now have an empty data file. To be able to call
_ma_initialize_data_file() we need some pieces of the share to be
correctly filled. So we just open the table (fortunately, an empty
data file does not preclude this).
*/
if (((info= maria_open(name, O_RDONLY, 0)) == NULL) ||
_ma_initialize_data_file(info->s, info->dfile.file))
{
eprint(tracef, "Failed to open new table or write to data file");
      goto end;
    }
  }
  error= 0;
end:
  if (kfile >= 0)
    error|= my_close(kfile, MYF(MY_WME));
  if (info != NULL)
    error|= maria_close(info);
  return error;
}
WL#3071 Maria checkpoint
Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed).
WL#3072 Maria recovery
* replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys.
* replaying of REDO_RENAME_TABLE
Now, off to test Recovery in ha_maria :)

BitKeeper/deleted/.del-ma_least_recently_dirtied.c:
  Delete: storage/maria/ma_least_recently_dirtied.c
BitKeeper/deleted/.del-ma_least_recently_dirtied.h:
  Delete: storage/maria/ma_least_recently_dirtied.h
storage/maria/Makefile.am:
  compile Checkpoint module
storage/maria/ha_maria.cc:
  When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly.
storage/maria/ma_blockrec.c:
  * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor)
  * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase.
storage/maria/ma_check.c:
  All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair().
storage/maria/ma_checkpoint.c:
  Checkpoint module.
storage/maria/ma_checkpoint.h:
  optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages.
storage/maria/ma_create.c:
  * no need to init some vars as the initial bzero(share) takes care of this.
  * update to new function's name
  * even if we fail in my_sync() we have to my_close()
storage/maria/ma_extra.c:
  Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen().
storage/maria/ma_init.c:
  destroy checkpoint module when Maria shuts down.
storage/maria/ma_loghandler.c:
  * UNDO_ROW_PURGE gone (see ma_blockrec.c)
  * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed in the checkpoint record (Recovery wants to know the validity domain of an id->name mapping)
  * translog_get_horizon_no_lock() needed for Checkpoint
  * comment about failing assertion (Sanja knows)
  * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header".
  * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook.
  * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock.
  * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN.
storage/maria/ma_loghandler.h:
  prototype updates
storage/maria/ma_open.c:
  no need to initialize "res"
storage/maria/ma_pagecache.c:
  When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int.
storage/maria/ma_pagecache.h:
  new prototype
storage/maria/ma_recovery.c:
  * added replaying of REDO_RENAME_TABLE
  * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END
  * Recovery from the last checkpoint record now possible
  * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id).
  * in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record).
  * parse_checkpoint_record() has to return a LSN, that's what caller expects
storage/maria/ma_rename.c:
  new function's name; log end zeroes of tables' names (ease recovery)
storage/maria/ma_test2.c:
  * equivalent of ma_test1's --test-undo added (named -u here).
  * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()).
storage/maria/ma_test_recovery.expected:
  Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys" (etc) index state).
storage/maria/ma_test_recovery:
  Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT.
storage/maria/ma_write.c:
  comment
storage/maria/maria_chk.c:
  comment
storage/maria/maria_def.h:
  * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint.
  * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs).
storage/maria/trnman.c:
  * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
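The two LSN rules in the commit notes above (skip a table whose id->name mapping is older than create_rename_lsn; skip a REDO whose id->name mapping is newer than the record) can be sketched in isolation as follows. This is only an illustration, not the code of new_table() or get_MARIA_HA_from_REDO_record(): the demo_* names and types are hypothetical, an LSN is modelled as a plain 64-bit integer, and the treatment of equal LSNs is an assumption because the notes do not state it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN: a plain 64-bit log address. */
typedef uint64_t demo_lsn_t;

/* Hypothetical subset of per-table state used by the two checks. */
typedef struct
{
  demo_lsn_t create_rename_lsn; /* REDOs before this LSN are ignored */
  demo_lsn_t lsn_of_file_id;    /* LSN of the record that mapped id->name */
} demo_share_t;

/*
  Rule 1: when opening a table for an id->name mapping, skip the table if
  the mapping is older than the table's create_rename_lsn.
  Assumption: equal LSNs do not cause a skip.
*/
static bool demo_skip_table(demo_lsn_t lsn_of_file_id,
                            demo_lsn_t create_rename_lsn)
{
  return lsn_of_file_id < create_rename_lsn;
}

/*
  Rule 2: when applying a REDO, skip the record if the id->name mapping is
  newer than the record (a new mapping must not apply to an old REDO).
  Assumption: equal LSNs do not cause a skip.
*/
static bool demo_skip_record(const demo_share_t *share, demo_lsn_t record_lsn)
{
  return share->lsn_of_file_id > record_lsn;
}

/* Small self-check exercising both rules. */
static void demo_selfcheck(void)
{
  demo_share_t share= { 100, 200 };
  assert(!demo_skip_table(share.lsn_of_file_id, share.create_rename_lsn));
  assert(demo_skip_table(50, share.create_rename_lsn));
  assert(demo_skip_record(&share, 150));   /* record predates the mapping */
  assert(!demo_skip_record(&share, 250));  /* record is after the mapping */
}
```

Keeping each rule in its own predicate makes the direction of each comparison easy to check against the wording of the notes.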
prototype_redo_exec_hook(REDO_RENAME_TABLE)
{
  char *old_name, *new_name;
  int error= 1;
  MARIA_HA *info= NULL;
WL#3072 Maria Recovery
* recovery from ha_maria now skips replaying DDLs (too dangerous)
* maria_read_log still replays DDLs, print warning about issues
* fixes to replaying of REDO_RENAME
* don't replay DDLs on corrupted tables (safer)
* print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read.

storage/maria/ma_checkpoint.c:
  fix for assertion failure
storage/maria/ma_recovery.c:
  * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too.
  * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions).
  * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table).
  * don't replay DDLs on corrupted tables (play safe)
  * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open).
  * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes).
storage/maria/ma_recovery.h:
  parameter to replay DDLs or not
storage/maria/maria_read_log.c:
  replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
  if (skip_DDLs)
  {
    tprint(tracef, "we skip DDLs\n");
    return 0;
  }
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
    eprint(tracef, "Failed to read record");
goto end;
}
old_name= (char *)log_record_buffer.str;
new_name= old_name + strlen(old_name) + 1;
tprint(tracef, "Table '%s' to rename to '%s'; old-name table ", old_name,
new_name);
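The two names above are read from one buffer in which the old name and the new name are stored back to back, each with its terminating zero. A minimal sketch of that layout, assuming a length-bounded buffer; `split_rename_names()` is a hypothetical helper for illustration and is not part of ma_recovery.c:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
  Hypothetical helper (illustration only): splits a buffer laid out as
  "old_name\0new_name\0" into its two strings.
  Returns 0 on success, -1 if the buffer does not hold two terminated names.
*/
static int split_rename_names(const char *buf, size_t len,
                              const char **old_name, const char **new_name)
{
  const char *end_old= memchr(buf, '\0', len);
  size_t rest;
  if (end_old == NULL)
    return -1;                       /* first name is not terminated */
  rest= len - (size_t) (end_old - buf) - 1;
  if (rest == 0 || memchr(end_old + 1, '\0', rest) == NULL)
    return -1;                       /* second name missing or unterminated */
  *old_name= buf;
  *new_name= end_old + 1;            /* same pointer arithmetic as strlen()+1 */
  return 0;
}
```

The bounds check is an addition of this sketch; the original code relies on the end zeroes having been written to the log by the rename code.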
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
/*
Here is why we skip CREATE/DROP/RENAME when doing a recovery from
ha_maria (whereas we do replay them when called from maria_read_log).
Consider:
CREATE TABLE t;
RENAME TABLE t to u;
DROP TABLE u;
RENAME TABLE v to u; # crash between index rename and data rename.
And do a Recovery (not removing tables beforehand).
Recovery replays CREATE, then RENAME: the maria_open("t") works,
maria_open("u") does not (no data file) so table "u" is considered
nonexistent and so maria_rename() is done, which overwrites u's index file,
which is lost. Ok, the data file (v.MAD) is still available, but only a
REPAIR USE_FRM can rebuild the index, which is unsafe and means downtime.
So it is preferable to not execute RENAME, and leave the "mess" of files,
rather than possibly destroy a file. The DBA will manually rename files.
A safe recovery method would probably require checking the existence of
the index file and of the data file separately (not via maria_open()), and
maybe also storing a create_rename_lsn in the data file too.
For now, all we risk is to leave the mess (half-renamed files) left by the
crash. We however sync files and directories at each file rename. The SQL
layer is anyway not crash-safe for DDLs (except the repartitioning-related
ones).
We replay DDLs in maria_read_log to be able to recreate tables from
scratch. It means that "maria_read_log -a" should not be used on a
database which just crashed during a DDL. And also ALTER TABLE does not
log insertions of records into the temporary table, so replaying may
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionility, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
fail (grep for INCOMPLETE_LOG in files).
*/
info= maria_open(old_name, O_RDONLY, HA_OPEN_FOR_REPAIR);
if (info)
{
MARIA_SHARE *share= info->s;
if (!share->base.born_transactional)
{
tprint(tracef, ", is not transactional, ignoring renaming\n");
ALERT_USER();
error= 0;
goto end;
}
if (cmp_translog_addr(share->state.create_rename_lsn, rec->lsn) >= 0)
{
tprint(tracef, ", has create_rename_lsn (%lu,0x%lx) more recent than"
" record, ignoring renaming",
LSN_IN_PARTS(share->state.create_rename_lsn));
error= 0;
goto end;
}
if (maria_is_crashed(info))
{
tprint(tracef, ", is crashed, can't rename it");
ALERT_USER();
goto end;
}
Changed all file names in maria to LEX_STRING and removed some calls to strlen() Ensure that pagecache gives correct error number even if error for block happened mysys/my_pread.c: Indentation fix storage/maria/ha_maria.cc: filenames changed to be of type LEX_STRING storage/maria/ma_check.c: filenames changed to be of type LEX_STRING storage/maria/ma_checkpoint.c: filenames changed to be of type LEX_STRING storage/maria/ma_create.c: filenames changed to be of type LEX_STRING storage/maria/ma_dbug.c: filenames changed to be of type LEX_STRING storage/maria/ma_delete.c: filenames changed to be of type LEX_STRING storage/maria/ma_info.c: filenames changed to be of type LEX_STRING storage/maria/ma_keycache.c: filenames changed to be of type LEX_STRING storage/maria/ma_locking.c: filenames changed to be of type LEX_STRING storage/maria/ma_loghandler.c: filenames changed to be of type LEX_STRING storage/maria/ma_open.c: filenames changed to be of type LEX_STRING storage/maria/ma_pagecache.c: Store error number for last failed operation in the page block This should fix some asserts() when errno was not properly set after failure to read block in another thread storage/maria/ma_recovery.c: filenames changed to be of type LEX_STRING storage/maria/ma_update.c: filenames changed to be of type LEX_STRING storage/maria/ma_write.c: filenames changed to be of type LEX_STRING storage/maria/maria_def.h: filenames changed to be of type LEX_STRING storage/maria/maria_ftdump.c: filenames changed to be of type LEX_STRING storage/maria/maria_pack.c: filenames changed to be of type LEX_STRING
2008-08-25 13:49:47 +02:00
if (close_one_table(info->s->open_file_name.str, rec->lsn) ||
maria_close(info))
goto end;
info= NULL;
tprint(tracef, ", is ok for renaming; new-name table ");
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
}
else /* one or two files absent, or header corrupted... */
{
tprint(tracef, ", can't be opened, probably does not exist");
error= 0;
goto end;
}
/*
We must also check the create_rename_lsn of the 'new_name' table if it
exists: otherwise we may, with our rename which overwrites, destroy
another table. For example:
CREATE TABLE t;
RENAME TABLE t TO u;
DROP TABLE u;
RENAME TABLE v TO u; # v is an old table, its creation/insertions not in log
Then start executing the log (without removing tables beforehand): it
creates t, renames it to u (if create_rename_lsn is not tested), thus
overwriting the existing u (the table formerly named v), drops u, and we
are stuck: we have lost data.
*/
info= maria_open(new_name, O_RDONLY, HA_OPEN_FOR_REPAIR);
if (info)
{
MARIA_SHARE *share= info->s;
/* We should not have open instances on this table. */
if (share->reopen != 1)
{
tprint(tracef, ", is already open (reopen=%u)\n", share->reopen);
ALERT_USER();
goto end;
}
if (!share->base.born_transactional)
{
tprint(tracef, ", is not transactional, ignoring renaming\n");
ALERT_USER();
goto drop;
}
if (cmp_translog_addr(share->state.create_rename_lsn, rec->lsn) >= 0)
{
tprint(tracef, ", has create_rename_lsn (%lu,0x%lx) more recent than"
" record, ignoring renaming",
LSN_IN_PARTS(share->state.create_rename_lsn));
/*
We have to drop the old_name table. Consider:
CREATE TABLE t;
CREATE TABLE v;
RENAME TABLE t to u;
DROP TABLE u;
RENAME TABLE v to u;
and apply the log without removing tables beforehand. t will be
created, v too; in REDO_RENAME u will be more recent, but we still
have to drop t otherwise it stays.
*/
goto drop;
}
if (maria_is_crashed(info))
{
tprint(tracef, ", is crashed, can't rename it");
ALERT_USER();
goto end;
}
if (maria_close(info))
goto end;
info= NULL;
/* abnormal situation */
tprint(tracef, ", exists but is older than record, can't rename it");
goto end;
}
else /* one or two files absent, or header corrupted... */
tprint(tracef, ", can't be opened, probably does not exist");
tprint(tracef, ", renaming '%s'", old_name);
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
if (maria_rename(old_name, new_name))
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to rename table");
goto end;
}
info= maria_open(new_name, O_RDONLY, 0);
if (info == NULL)
{
    eprint(tracef, "Failed to open renamed table");
    goto end;
  }
Added versioning of Maria index Store max_trid in index file as state.create_trid. This is used to pack all transids in the index pages relative to max possible transid for file. Enable versioning for transactional tables with index. Tables with an auto-increment key, rtree or fulltext keys are not versioned. Changed info->lastkey to type MARIA_KEY. Removed info->lastkey_length as this is now part of info->lastkey Renamed old info->lastkey to info->lastkey_buff Use exact key lenghts for keys, not USE_WHOLE_KEY For partial key searches, use SEARCH_PART_KEY When searching to insert new key on page, use SEARCH_INSERT to mark that key has rowid Changes done in a lot of files: - Modified functions to use MARIA_KEY instead of key pointer and key length - Use keyinfo->root_lock instead of share->key_root_lock[keynr] - Simplify code by using local variable keyinfo instead if share->keyinfo[i] - Added #fdef EXTERNAL_LOCKING around removed state elements - HA_MAX_KEY_BUFF -> MARIA_MAX_KEY_BUFF (to reserve space for transid) - Changed type of 'nextflag' to uint32 to ensure all SEARCH_xxx flags fits into it .bzrignore: Added missing temporary directory extra/Makefile.am: comp_err is now deleted on make distclean include/maria.h: Added structure MARIA_KEY, which is used for intern key objects in Maria. Changed functions to take MARIA_KEY as an argument instead of pointer to packed key. Changed some functions that always return true or false to my_bool. Added virtual function make_key() to avoid if in _ma_make_key() Moved rw_lock_t for locking trees from share->key_root_lock to MARIA_KEYDEF. This makes usage of the locks simpler and faster include/my_base.h: Added HA_RTREE_INDEX flag to mark rtree index. 
Used for easier checks in ma_check() Added SEARCH_INSERT to be used when inserting new keys Added SEARCH_PART_KEY for partial searches Added SEARCH_USER_KEY_HAS_TRANSID to be used when key we use for searching in btree has a TRANSID Added SEARCH_PAGE_KEY_HAS_TRANSID to be used when key we found in btree has a transid include/my_handler.h: Make next_flag 32 bit to make sure we can handle all SEARCH_ bits mysql-test/include/maria_empty_logs.inc: Read and restore current database; Don't assume we are using mysqltest. Don't log use databasename to log. Using this include should not cause any result changes. mysql-test/r/maria-gis-rtree-dynamic.result: Updated results after adding some check table commands to help pinpoint errors mysql-test/r/maria-mvcc.result: New tests mysql-test/r/maria-purge.result: New result after adding removal of logs mysql-test/r/maria-recovery-big.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria-recovery-bitmap.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria-recovery-rtree-ft.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria-recovery.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria.result: New tests mysql-test/r/variables-big.result: Don't log id as it's not predictable mysql-test/suite/rpl_ndb/r/rpl_truncate_7ndb_2.result: Updated results to new binlog results. 
(Test has not been run in a long time as it requires --big) mysql-test/suite/rpl_ndb/t/rpl_truncate_7ndb_2-master.opt: Moved file to ndb replication test directory mysql-test/suite/rpl_ndb/t/rpl_truncate_7ndb_2.test: Fixed wrong path to included tests mysql-test/t/maria-gis-rtree-dynamic.test: Added some check table commands to help pinpoint errors mysql-test/t/maria-mvcc.test: New tests mysql-test/t/maria-purge.test: Remove logs to make test results predictable mysql-test/t/maria.test: New tests for some possible problems mysql-test/t/variables-big.test: Don't log id as it's not predictable mysys/my_handler.c: Updated function comment to reflect old code Changed nextflag to be uint32 to ensure we can have flags > 16 bit Changed checking if we are in insert with NULL keys as next_flag can now include additional bits that have to be ignored. Added SEARCH_INSERT flag to be used when inserting new keys in btree. This flag tells us the that the keys includes row position and it's thus safe to remove SEARCH_FIND Added comparision of transid. This is only done if the keys actually have a transid, which is indicated by nextflag mysys/my_lock.c: Fixed wrong test (Found by Guilhem) scripts/Makefile.am: Ensure that test programs are deleted by make clean sql/rpl_rli.cc: Moved assignment order to fix compiler warning storage/heap/hp_write.c: Add SEARCH_INSERT to signal ha_key_cmp that we we should also compare rowid for keys storage/maria/Makefile.am: Remove also maria log files when doing make distclean storage/maria/ha_maria.cc: Use 'file->start_state' as default state for transactional tables without versioning At table unlock, set file->state to point to live state. (Needed for information schema to pick up right number of rows) In ha_maria::implicit_commit() move all locked (ie open) tables to new transaction. This is needed to ensure ha_maria->info doesn't point to a deleted history event. Disable concurrent inserts for insert ... 
select and table changes with subqueries if statement based replication as this would cause wrong results on slave storage/maria/ma_blockrec.c: Updated comment storage/maria/ma_check.c: Compact key pages (removes transid) when doing --zerofill Check that 'page_flag' on key pages contains KEYPAGE_FLAG_HAS_TRANSID if there is a single key on the page with a transid Modified functions to use MARIA_KEY instead of key pointer and key length Use new interface to _ma_rec_pos(), _ma_dpointer(), _ma_ft_del(), ma_update_state_lsn() Removed not needed argument from get_record_for_key() Fixed that we check doesn't give errors for RTREE; We now treath these like SPATIAL Remove some SPATIAL specific code where the virtual functions can handle this in a general manner Use info->lastkey_buff instead of info->lastkey _ma_dpos() -> _ma_row_pos_from_key() _ma_make_key() -> keyinfo->make_key() _ma_print_key() -> _ma_print_keydata() _ma_move_key() -> ma_copy_copy() Add SEARCH_INSERT to signal ha_key_cmp that we we should also compare rowid for keys Ensure that data on page doesn't overwrite page checksum position Use DBUG_DUMP_KEY instead of DBUG_DUMP Use exact key lengths instead of USE_WHOLE_KEY to ha_key_cmp() Fixed check if rowid points outside of BLOCK_RECORD data file Use info->lastkey_buff instead of key on stack in some safe places Added #fdef EXTERNAL_LOCKING around removed state elements storage/maria/ma_close.c: Use keyinfo->root_lock instead of share->key_root_lock[keynr] storage/maria/ma_create.c: Removed assert that is already checked in maria_init() Force transactinal tables to be of type BLOCK_RECORD Fixed wrong usage of HA_PACK_RECORD (should be HA_OPTION_PACK_RECORD) Mark keys that uses HA_KEY_ALG_RTREE with HA_RTREE_INDEX for easier handling of these in ma_check Store max_trid in index file as state.create_trid. This is used to pack all transids in the index pages relative to max possible transid for file. 
storage/maria/ma_dbug.c: Changed _ma_print_key() to use MARIA_KEY storage/maria/ma_delete.c: Modified functions to use MARIA_KEY instead of key pointer and key length info->lastkey2-> info->lastkey_buff2 Added SEARCH_INSERT to signal ha_key_cmp that we we should also compare rowid for keys Use new interface for get_key(), _ma_get_last_key() and others _ma_dpos() -> ma_row_pos_from_key() Simplify setting of prev_key in del() Ensure that KEYPAGE_FLAG_HAS_TRANSID is set in page_flag if key page has transid Treath key pages that may have a transid as if keys would be of variable length storage/maria/ma_delete_all.c: Reset history state if maria_delete_all_rows() are called Update parameters to _ma_update_state_lsns() call storage/maria/ma_extra.c: Store and restore info->lastkey storage/maria/ma_ft_boolean_search.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_ft_nlq_search.c: Modified functions to use MARIA_KEY instead of key pointer and key length Use lastkey_buff2 instead of info->lastkey+info->s->base.max_key_length (same thing) storage/maria/ma_ft_update.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_ftdefs.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_fulltext.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_init.c: Check if blocksize is legal (Moved test here from ma_open()) storage/maria/ma_key.c: Added functions for storing/reading of transid Modified functions to use MARIA_KEY instead of key pointer and key length Moved _ma_sp_make_key() out of _ma_make_key() as we now use keyinfo->make_key to create keys Add transid to keys if table is versioned Added _ma_copy_key() storage/maria/ma_key_recover.c: Add logging of page_flag (holds information if there are keys with transid on page) Changed DBUG_PRINT("info" -> DBUG_PRINT("redo" as the redo logging can be quite extensive Added 
lots of DBUG_PRINT() Added support for index page operations: KEY_OP_SET_PAGEFLAG and KEY_OP_COMPACT_PAGE storage/maria/ma_key_recover.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_locking.c: Added new arguments to _ma_update_state_lsns_sub() storage/maria/ma_loghandler.c: Fixed all logging of LSN to look similar in DBUG log Changed if (left != 0) to if (left) as the later is used also later in the code storage/maria/ma_loghandler.h: Added new index page operations storage/maria/ma_open.c: Removed allocated "state_dummy" and instead use share->state.common for transactional tables that are not versioned This is needed to not get double increments of state.records (one in ma_write.c and on when log is written) Changed info->lastkey to MARIA_KEY type Removed resetting of MARIA_HA variables that have 0 as default value (as info is zerofilled) Enable versioning for transactional tables with index. Tables with an auto-increment key, rtree or fulltext keys are not versioned. Check on open that state.create_trid is correct Extend share->base.max_key_length in case of transactional table so that it can hold transid Removed 4.0 compatible fulltext key mode as this is not relevant for Maria Removed old and wrong #ifdef ENABLE_WHEN_WE_HAVE_TRANS_ROW_ID code block Initialize all new virtual function pointers Removed storing of state->unique, state->process and store state->create_trid instead storage/maria/ma_page.c: Added comment to describe key page structure Added functions to compact key page and log the compact operation storage/maria/ma_range.c: Modified functions to use MARIA_KEY instead of key pointer and key length Use SEARCH_PART_KEY indicator instead of USE_WHOLE_KEY to detect if we are doing a part key search Added handling of pages with transid storage/maria/ma_recovery.c: Don't assert if table we opened are not transactional. 
This may be a table which has been changed from transactional to not transactinal Added new arguments to _ma_update_state_lsns() storage/maria/ma_rename.c: Added new arguments to _ma_update_state_lsns() storage/maria/ma_rkey.c: Modified functions to use MARIA_KEY instead of key pointer and key length Don't use USE_WHOLE_KEY, use real length of key Use share->row_is_visible() to test if row is visible Moved search_flag == HA_READ_KEY_EXACT out of 'read-next-row' loop as this only need to be tested once Removed test if last_used_keyseg != 0 as this is always true storage/maria/ma_rnext.c: Modified functions to use MARIA_KEY instead of key pointer and key length Simplify code by using local variable keyinfo instead if share->keyinfo[i] Use share->row_is_visible() to test if row is visible storage/maria/ma_rnext_same.c: Modified functions to use MARIA_KEY instead of key pointer and key length lastkey2 -> lastkey_buff2 storage/maria/ma_rprev.c: Modified functions to use MARIA_KEY instead of key pointer and key length Simplify code by using local variable keyinfo instead if share->keyinfo[i] Use share->row_is_visible() to test if row is visible storage/maria/ma_rsame.c: Updated comment Simplify code by using local variable keyinfo instead if share->keyinfo[i] Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rsamepos.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rt_index.c: Modified functions to use MARIA_KEY instead of key pointer and key length Use better variable names Removed not needed casts _ma_dpos() -> _ma_row_pos_from_key() Use info->last_rtree_keypos to save position to key instead of info->int_keypos Simplify err: condition Changed return type for maria_rtree_insert() to my_bool as we are only intressed in ok/fail from this function storage/maria/ma_rt_index.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rt_key.c: Modified 
functions to use MARIA_KEY instead of key pointer and key length Simplify maria_rtree_add_key by combining idenitcal code and removing added_len storage/maria/ma_rt_key.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rt_mbr.c: Changed type of 'nextflag' to uint32 Added 'to' argument to RT_PAGE_MBR_XXX functions to more clearly see which variables changes value storage/maria/ma_rt_mbr.h: Changed type of 'nextflag' to uint32 storage/maria/ma_rt_split.c: Modified functions to use MARIA_KEY instead of key pointer and key length key_length -> key_data_length to catch possible errors storage/maria/ma_rt_test.c: Fixed wrong comment Reset recinfo to avoid valgrind varnings Fixed wrong argument to create_record() that caused test to fail storage/maria/ma_search.c: Modified functions to use MARIA_KEY instead of key pointer and key length Added support of keys with optional trid Test for SEARCH_PART_KEY instead of USE_WHOLE_KEY to detect part key reads _ma_dpos() -> _ma_row_pos_from_key() If there may be keys with transid on the page, have _ma_bin_search() call _ma_seq_search() Add _ma_skip_xxx() functions to quickly step over keys (faster than calling get_key() in most cases as we don't have to copy key data) Combine similar code at end of _ma_get_binary_pack_key() Removed not used function _ma_move_key() In _ma_search_next() don't call _ma_search() if we aren't on a nod page. 
Update info->cur_row.trid with trid for found key Removed some not needed casts Added _ma_trid_from_key() Use MARIA_SHARE instead of MARIA_HA as arguments to _ma_rec_pos(), _ma_dpointer() and _ma_xxx_keypos_to_recpos() to make functions faster and smaller storage/maria/ma_sort.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_sp_defs.h: _ma_sp_make_key() now fills in and returns (MARIA_KEY *) value storage/maria/ma_sp_key.c: _ma_sp_make_key() now fills in and returns (MARIA_KEY *) value Don't test sizeof(double), test against 8 as we are using float8store() Use mi_float8store() instead of doing swap of value (same thing but faster) storage/maria/ma_state.c: maria_versioning() now only calls _ma_block_get_status() if table supports versioning Added _ma_row_visible_xxx() functions for different occasions When emptying history, set info->state to point to the first history event. storage/maria/ma_state.h: Added _ma_row_visible_xxx() prototypes storage/maria/ma_static.c: Indentation changes storage/maria/ma_statrec.c: Fixed arguments to _ma_dpointer() and _ma_rec_pos() storage/maria/ma_test1.c: Call init_thr_lock() if we have versioning storage/maria/ma_test2.c: Call init_thr_lock() if we have versioning storage/maria/ma_unique.c: Modified functions to use MARIA_KEY storage/maria/ma_update.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_write.c: Modified functions to use MARIA_KEY instead of key pointer and key length Simplify code by using local variable keyinfo instead if share->keyinfo[i] In _ma_enlarge_root(), mark in page_flag if new key has transid _ma_dpos() -> _ma_row_pos_from_key() Changed return type of _ma_ck_write_tree() to my_bool as we are only testing if result is true or not Moved 'reversed' to outside block as area was used later storage/maria/maria_chk.c: Added error if trying to sort with HA_BINARY_PACK_KEY Use new interface to get_key() and _ma_dpointer() 
_ma_dpos() -> _ma_row_pos_from_key() storage/maria/maria_def.h: Modified functions to use MARIA_KEY instead of key pointer and key length Added 'common' to MARIA_SHARE->state for storing state for transactional tables without versioning Added create_trid to MARIA_SHARE Removed not used state variables 'process' and 'unique' Added defines for handling TRID's in index pages Changed to use MARIA_SHARE instead of MARIA_HA for some functions Added 'have_versioning' flag if table supports versioning Moved key_root_lock from MARIA_SHARE to MARIA_KEYDEF Changed last_key to be of type MARIA_KEY. Removed lastkey_length lastkey -> lastkey_buff, lastkey2 -> lastkey_buff2 Added _ma_get_used_and_nod_with_flag() for faster access to page data when page_flag is read Added DBUG_DUMP_KEY for easier DBUG_DUMP of a key Changed 'nextflag' and assocaited variables to uint32 storage/maria/maria_ftdump.c: lastkey -> lastkey_buff storage/maria/trnman.c: Fixed wrong initialization of min_read_from and max_commit_trid Added trnman_get_min_safe_trid() storage/maria/unittest/ma_test_all-t: Added --start-from storage/myisam/mi_check.c: Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparision for inserting key on page in rowid order storage/myisam/mi_delete.c: Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparision for inserting key on page in rowid order storage/myisam/mi_range.c: Updated comment storage/myisam/mi_write.c: Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparision for inserting key on page in rowid order storage/myisam/rt_index.c: Fixed wrong parameter to rtree_get_req() which could cause crash
2008-06-26 07:18:28 +02:00
  if (_ma_update_state_lsns(info->s, rec->lsn, info->s->state.create_trid,
                            TRUE, TRUE))
    goto end;
  if (maria_close(info))
    goto end;
  info= NULL;
WL#3071 Maria checkpoint
Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed).
WL#3072 Maria recovery
* replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys.
* replaying of REDO_RENAME_TABLE
Now, off to test Recovery in ha_maria :)
BitKeeper/deleted/.del-ma_least_recently_dirtied.c:
  Delete: storage/maria/ma_least_recently_dirtied.c
BitKeeper/deleted/.del-ma_least_recently_dirtied.h:
  Delete: storage/maria/ma_least_recently_dirtied.h
storage/maria/Makefile.am:
  compile Checkpoint module
storage/maria/ha_maria.cc:
  When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly.
storage/maria/ma_blockrec.c:
  * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor)
  * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase.
storage/maria/ma_check.c:
  All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair().
storage/maria/ma_checkpoint.c:
  Checkpoint module.
storage/maria/ma_checkpoint.h:
  optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages.
storage/maria/ma_create.c:
  * no need to init some vars as the initial bzero(share) takes care of this.
  * update to new function's name
  * even if we fail in my_sync() we have to my_close()
storage/maria/ma_extra.c:
  Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen().
storage/maria/ma_init.c:
  destroy checkpoint module when Maria shuts down.
storage/maria/ma_loghandler.c:
  * UNDO_ROW_PURGE gone (see ma_blockrec.c)
  * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping)
  * translog_get_horizon_no_lock() needed for Checkpoint
  * comment about failing assertion (Sanja knows)
  * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header".
  * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook.
  * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock.
  * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN.
storage/maria/ma_loghandler.h:
  prototype updates
storage/maria/ma_open.c:
  no need to initialize "res"
storage/maria/ma_pagecache.c:
  When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int.
storage/maria/ma_pagecache.h:
  new prototype
storage/maria/ma_recovery.c:
  * added replaying of REDO_RENAME_TABLE
  * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END
  * Recovery from the last checkpoint record now possible
  * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id).
  * in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record).
  * parse_checkpoint_record() has to return a LSN, that's what caller expects
storage/maria/ma_rename.c:
  new function's name; log end zeroes of tables' names (ease recovery)
storage/maria/ma_test2.c:
  * equivalent of ma_test1's --test-undo added (named -u here).
  * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()).
storage/maria/ma_test_recovery.expected:
  Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state).
storage/maria/ma_test_recovery:
  Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT.
storage/maria/ma_write.c:
  comment
storage/maria/maria_chk.c:
  comment
storage/maria/maria_def.h:
  * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint.
  * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs).
storage/maria/trnman.c:
  * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
  error= 0;
WL#3072 Maria Recovery
* recovery from ha_maria now skips replaying DDLs (too dangerous)
* maria_read_log still replays DDLs, print warning about issues
* fixes to replaying of REDO_RENAME
* don't replay DDLs on corrupted tables (safer)
* print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read.
storage/maria/ma_checkpoint.c:
  fix for assertion failure
storage/maria/ma_recovery.c:
  * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too.
  * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions).
  * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table).
  * don't replay DDLs on corrupted tables (play safe)
  * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open).
  * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes).
storage/maria/ma_recovery.h:
  parameter to replay DDLs or not
storage/maria/maria_read_log.c:
  replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
  goto end;
drop:
  tprint(tracef, ", only dropping '%s'", old_name);
2007-09-15 14:45:26 +02:00
  if (maria_delete_table(old_name))
  {
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deletion of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables) storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automatically from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to drop table");
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
goto end;
}
error= 0;
goto end;
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
end:
tprint(tracef, "\n");
if (info != NULL)
error|= maria_close(info);
return error;
}
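The commit notes attached to this function state two LSN rules: all REDOs before create_rename_lsn are ignored by Recovery, and a record is skipped when the id->name mapping is newer than the record. A minimal sketch of those two rules as one predicate follows. It is an illustration only, not the code in ma_recovery.c: `lsn_t`, `struct table_lsn_state` and `redo_record_applies()` are hypothetical names, LSNs are modeled as plain 64-bit integers where a larger value is later, and the handling of equal LSNs is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a log sequence number; larger means later. */
typedef uint64_t lsn_t;

/* Hypothetical per-table state; field names mirror the commit notes. */
struct table_lsn_state
{
  lsn_t create_rename_lsn; /* REDOs before this LSN are ignored */
  lsn_t lsn_of_file_id;    /* LSN at which the id->name mapping was logged */
};

/*
  Return true if a REDO record with LSN rec_lsn should be applied.
  Assumption: strict comparisons, so a record whose LSN equals either
  bound is applied; the real code may treat equality differently.
*/
static bool redo_record_applies(const struct table_lsn_state *t,
                                lsn_t rec_lsn)
{
  assert(t != NULL);
  if (t->lsn_of_file_id > rec_lsn)
    return false;  /* the id->name mapping is newer than the record */
  if (rec_lsn < t->create_rename_lsn)
    return false;  /* the record predates create_rename_lsn */
  return true;
}
```

Keeping both checks in one predicate makes the ordering requirement from the ma_check.c note easy to see: if create_rename_lsn were set before the repaired files reached disk, records rejected by the second check could be needed after a crash.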
/*
The record may come from REPAIR, ALTER TABLE ENABLE KEYS, OPTIMIZE.
*/
prototype_redo_exec_hook(REDO_REPAIR_TABLE)
{
int error= 1;
MARIA_HA *info;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updated some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
HA_CHECK param;
char *name;
Added --with-maria-tmp-tables (default one) to allow one to configure if Maria should be used for internal temporary tables Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables Fixed bug that caused update of big blobs to crash Use pagecache_page_no_t as type for pages (to get rid of compiler warnings) Added cast to get rid of compiler warning Fixed wrong types of variables and arguments that caused lost information Fixed wrong DBUG_ASSERT() that caused REDO of big blobs to fail Removed some historical ifdefs that caused problem with windows compilations BUILD/SETUP.sh: Added --with-maria-tmp-tables include/maria.h: Use pagecache_page_no_t as type for pages Use my_bool as parameter for 'rep_quick' option include/my_base.h: Added comment mysql-test/r/maria-big.result: Added test that uses big blobs mysql-test/t/maria-big.test: Added test that uses big blobs sql/mysqld.cc: Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables sql/sql_class.h: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined sql/sql_select.cc: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined storage/maria/ha_maria.cc: Fixed compiler warnings reported by MCC - Fixed usage of wrong types that caused data loss - Changed parameter for rep_quick to my_bool - Added safe casts Fixed indentation storage/maria/ma_bitmap.c: Use pagecache_page_no_t as type for pages Fixed compiler warnings Fixed bug that caused update of big blobs to crash storage/maria/ma_blockrec.c: Use pagecache_page_no_t as type for pages Use my_bool as parameter for 'rep_quick' option Fixed compiler warnings Fixed wrong DBUG_ASSERT() storage/maria/ma_blockrec.h: Use pagecache_page_no_t as type for pages storage/maria/ma_check.c: Fixed some wrong parameters where we didn't get all bits for test_flag Changed rep_quick to be of type my_bool Use pagecache_page_no_t as type for pages Added casts to get rid of compiler 
warnings Changed type of record_pos to get rid of compiler warning storage/maria/ma_create.c: Added safe casts to get rid of compiler warnings storage/maria/ma_dynrec.c: Fixed usage of wrong type storage/maria/ma_key.c: Fixed compiler warning storage/maria/ma_key_recover.c: Use pagecache_page_no_t as type for pages storage/maria/ma_loghandler_lsn.h: Added casts to get rid of compiler warnings storage/maria/ma_page.c: Changed variable name from 'page' to 'pos' as it was an offset and not a page address Moved page_size inside block to get rid of compiler warning storage/maria/ma_pagecache.c: Fixed compiler warnings Replaced compile time assert with TODO storage/maria/ma_pagecache.h: Use pagecache_page_no_t as type for pages storage/maria/ma_pagecrc.c: Allow bitmap pages that are all zero storage/maria/ma_preload.c: Added cast to get rid of compiler warning storage/maria/ma_recovery.c: Changed types to get rid of compiler warnings Use bool for quick_repair to get rid of compiler warning Fixed some variables that were wrongly declared (not enough precision) Added cast to get rid of compiler warning storage/maria/ma_test2.c: Remove historical undefs storage/maria/maria_chk.c: Changed rep_quick to bool Fixed wrong parameter to maria_chk_data_link() storage/maria/maria_def.h: Use pagecache_page_no_t as type for pages storage/maria/maria_pack.c: Renamed isam -> maria storage/maria/plug.in: Added option --with-maria-tmp-tables storage/maria/trnman.c: Added cast to get rid of compiler warning storage/myisam/mi_test2.c: Remove historical undefs
2008-01-10 20:21:36 +01:00
my_bool quick_repair;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am:
  Change mode to -rw-rw-r--
BitKeeper/etc/ignore:
  added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log
include/maria.h:
  Added maria_tmpdir
include/my_base.h:
  Added error HA_ERR_FILE_TOO_SHORT
include/my_sys.h:
  Added variable my_crc_dbug_check
  Added function my_dbug_put_break_here()
include/myisamchk.h:
  Added org_key_map (Needed for writing REDO record for REPAIR)
mysql-test/r/innodb.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/mix2_myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/r/myisam.result:
  Updated to new checksum algorithm (NULL ignored)
mysql-test/t/myisam.test:
  Added used table
mysys/checksum.c:
  Added DBUG for checksum results
  Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC
mysys/lf_alloc-pin.c:
  Fixed compiler warning
mysys/my_handler.c:
  Added new error message
mysys/my_init.c:
  If my_progname is not given, use 'unknown' for my_progname_short
  Added debugger function my_debug_put_break_here()
mysys/my_pread.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
mysys/my_read.c:
  In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT
sql/mysqld.cc:
  Added debug option --debug-crc-break
sql/sql_parse.cc:
  Trivial optimization
storage/maria/ha_maria.cc:
  Renamed variable to be more logical
  Ensure that param.testflag is correct when calling repair
  Added extra argument to init_pagecache
  Set default value for maria_tmpdir
storage/maria/ma_blockrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_check.c:
  Set param->testflag to match how repair is run (needed for REDO logging)
  Simple optimization
  Moved flag if page is node from pagelength to keypage-flag byte
  Log used key map in REDO log.
storage/maria/ma_delete.c:
  Remember previous UNDO entry when writing undo (for future CLR records)
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed some bugs in redo logging
  Added CRC for some translog REDO_INDEX entries
storage/maria/ma_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_ft_update.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_key_recover.c:
  Added CRC for some translog REDO_INDEX entries
  Removed not needed pagecache_write() in _ma_apply_redo_index()
storage/maria/ma_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/maria/ma_loghandler.c:
  Added used key map to REDO_REPAIR_TABLE
storage/maria/ma_loghandler.h:
  Added operation for checksum of key pages
storage/maria/ma_open.c:
  Allocate storage for undo lsn pointers
storage/maria/ma_pagecache.c:
  Remove not needed include file
  Change logging to use fd: for file descriptors as other code
  Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log
  Don't allow pagecaches with fewer than 8 blocks
  Remove wrong DBUG_ASSERT()
storage/maria/ma_pagecache.h:
  Added readwrite_flags
storage/maria/ma_recovery.c:
  Better error messages for maria_read_log:
  - Added eprint() for printing error messages
  - Print extra \n before error message if we are printing %0 %10 ...
  Added used key_map to REDO_REPAIR log entry
  More DBUG
  Call same repair method that was used by mysqld
storage/maria/ma_rt_index.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_rt_key.c:
  Fixed call to _ma_store_page_used()
storage/maria/ma_rt_split.c:
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/ma_static.c:
  Added maria_tmpdir
storage/maria/ma_test1.c:
  Updated call to init_pagecache()
storage/maria/ma_test2.c:
  Updated call to init_pagecache()
storage/maria/ma_test3.c:
  Updated call to init_pagecache()
storage/maria/ma_write.c:
  Removed #ifdef NOT_YET
  Moved flag if page is node from pagelength to keypage-flag byte
  Fixed bug in _ma_log_del_prefix()
storage/maria/maria_chk.c:
  Fixed wrong min limit for page_buffer_size
  Updated call to init_pagecache()
storage/maria/maria_def.h:
  Added EXTRA_DEBUG_KEY_CHANGES. When this is defined, some REDO_INDEX entries contain page checksums
  Moved flag if page is node from pagelength to keypage-flag byte
storage/maria/maria_ftdump.c:
  Updated call to init_pagecache()
storage/maria/maria_pack.c:
  Updated call to init_pagecache()
  Reset share->state.create_rename_lsn & share->state.is_of_horizon
storage/maria/maria_read_log.c:
  Better error messages
  Added --tmpdir option (needed to set temporary directory for REDO_REPAIR)
  Added --start-from-lsn
  Changed option for --display-only to 'd' (wanted to use -o for 'offset')
storage/maria/unittest/lockman2-t.c:
  Added missing call to MY_INIT()
storage/maria/unittest/ma_pagecache_consist.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_pagecache_single.c:
  Fixed bug that caused program to sometimes fail
  Added some DBUG_ASSERTS()
  Changed some calls to malloc()/free() to my_malloc()/my_free()
  Create extra file to expose original hard-to-find bug
storage/maria/unittest/ma_test_loghandler-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  Updated call to init_pagecache()
storage/maria/unittest/test_file.c:
  Changed malloc()/free() to my_malloc()/my_free()
  Fixed memory leak
  Changed logic a bit while trying to find bug in reset_file()
storage/maria/unittest/trnman-t.c:
  Added missing call to MY_INIT()
storage/myisam/mi_cache.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_create.c:
  Removed O_EXCL to get TRUNCATE to work for temporary files
storage/myisam/mi_dynrec.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
storage/myisam/mi_locking.c:
  Test for HA_ERR_FILE_TOO_SHORT instead of -1
mysql-test/r/old-mode.result:
  New BitKeeper file ``mysql-test/r/old-mode.result''
mysql-test/t/old-mode-master.opt:
  New BitKeeper file ``mysql-test/t/old-mode-master.opt''
mysql-test/t/old-mode.test:
  New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
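The change note above says that a read which comes up short under MY_NABP or MY_FNABP now reports HA_ERR_FILE_TOO_SHORT, and that callers test for that error instead of -1. A minimal sketch of that idea follows, assuming an in-memory buffer in place of a real file. Every name in it (`sketch_pread`, `SKETCH_NABP`, `SKETCH_ERR_FILE_TOO_SHORT`, `sketch_errno`) and the numeric error value are hypothetical stand-ins, not the mysys API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; the real flag and error
   code are defined in the mysys/handler headers and their values are not
   reproduced here. */
#define SKETCH_NABP               1     /* caller wants all bytes or an error */
#define SKETCH_ERR_FILE_TOO_SHORT 1001  /* made-up numeric value */

static int sketch_errno= 0;

/*
  Read 'count' bytes at 'offset' from an in-memory "file".
  With SKETCH_NABP: returns 0 on a full read, (size_t) -1 on a short read
  and sets sketch_errno to SKETCH_ERR_FILE_TOO_SHORT.
  Without the flag: returns the number of bytes actually copied.
*/
static size_t sketch_pread(const char *file, size_t file_len,
                           char *buf, size_t count, size_t offset,
                           int flags)
{
  size_t avail= offset < file_len ? file_len - offset : 0;
  size_t n= avail < count ? avail : count;

  if (n > 0)
    memcpy(buf, file + offset, n);

  if (n != count && (flags & SKETCH_NABP))
  {
    /* The file ended before the request was satisfied. */
    sketch_errno= SKETCH_ERR_FILE_TOO_SHORT;
    return (size_t) -1;
  }
  sketch_errno= 0;
  return (flags & SKETCH_NABP) ? 0 : n;
}
```

With a dedicated error code, a caller can tell "file ended early" apart from other I/O failures by checking the error variable after a failed read, rather than only seeing a generic -1.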
DBUG_ENTER("exec_REDO_LOGREC_REDO_REPAIR_TABLE");
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
if (skip_DDLs)
{
/*
REPAIR is not exactly a DDL, but it manipulates files without logging
insertions into them.
*/
tprint(tracef, "we skip DDLs\n");
DBUG_RETURN(0);
}
if ((info= get_MARIA_HA_from_REDO_record(rec)) == NULL)
DBUG_RETURN(0);
if (maria_is_crashed(info))
{
tprint(tracef, "we skip repairing crashed table\n");
DBUG_RETURN(0);
}
/*
Otherwise, the mapping is newer than the table, and our record is newer
than the mapping, so we can repair.
*/
tprint(tracef, " repairing...\n");
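The checks above run in a fixed order before any repair is replayed: skip when DDL replay is disabled, skip when no table handle is found for the record, skip when the table is marked crashed, and only then repair. A minimal sketch of that ordering as a pure decision function follows. The type and function names (`sketch_table`, `sketch_action`, `sketch_repair_decision`) are hypothetical and exist only for this illustration; the real function works on `MARIA_HA` and the log record directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome codes for this sketch. */
typedef enum
{
  SKETCH_SKIP_DDL,       /* DDL replay disabled (e.g. recovery from ha_maria) */
  SKETCH_SKIP_NO_TABLE,  /* no open table matches the REDO record */
  SKETCH_SKIP_CRASHED,   /* table marked crashed: do not touch it */
  SKETCH_REPAIR          /* safe to replay the repair */
} sketch_action;

/* Hypothetical stand-in for the table handle; only the crashed flag matters. */
typedef struct
{
  int crashed;
} sketch_table;

/* Apply the guards in the same order as the code above. */
static sketch_action sketch_repair_decision(int skip_ddls,
                                            const sketch_table *table)
{
  if (skip_ddls)
    return SKETCH_SKIP_DDL;
  if (table == NULL)
    return SKETCH_SKIP_NO_TABLE;
  if (table->crashed)
    return SKETCH_SKIP_CRASHED;
  return SKETCH_REPAIR;
}
```

Note that in the original code the three skip cases return 0, so a skipped record is not treated as a failure and log replay continues with the next record.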
maria_chk_init(&param);
Changed all file names in maria to LEX_STRING and removed some calls to strlen() Ensure that pagecache gives correct error number even if error for block happened mysys/my_pread.c: Indentation fix storage/maria/ha_maria.cc: filenames changed to be of type LEX_STRING storage/maria/ma_check.c: filenames changed to be of type LEX_STRING storage/maria/ma_checkpoint.c: filenames changed to be of type LEX_STRING storage/maria/ma_create.c: filenames changed to be of type LEX_STRING storage/maria/ma_dbug.c: filenames changed to be of type LEX_STRING storage/maria/ma_delete.c: filenames changed to be of type LEX_STRING storage/maria/ma_info.c: filenames changed to be of type LEX_STRING storage/maria/ma_keycache.c: filenames changed to be of type LEX_STRING storage/maria/ma_locking.c: filenames changed to be of type LEX_STRING storage/maria/ma_loghandler.c: filenames changed to be of type LEX_STRING storage/maria/ma_open.c: filenames changed to be of type LEX_STRING storage/maria/ma_pagecache.c: Store error number for last failed operation in the page block This should fix some asserts() when errno was not properly set after failure to read block in another thread storage/maria/ma_recovery.c: filenames changed to be of type LEX_STRING storage/maria/ma_update.c: filenames changed to be of type LEX_STRING storage/maria/ma_write.c: filenames changed to be of type LEX_STRING storage/maria/maria_def.h: filenames changed to be of type LEX_STRING storage/maria/maria_ftdump.c: filenames changed to be of type LEX_STRING storage/maria/maria_pack.c: filenames changed to be of type LEX_STRING
2008-08-25 13:49:47 +02:00
param.isam_file_name= name= info->s->open_file_name.str;
Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type Added --zerofill (-z) option to maria_chk (Mostly code from Jani) Added new table states ZEROFILLED and MOVABLE Set state.changed with new states when things change include/maria.h: Added maria_zerofill include/myisamchk.h: Added option for zerofill Extend testflag to be 64 to allow for more flags mysql-test/r/create.result: Updated results after merge mysql-test/r/maria.result: Updated results after merge mysys/my_getopt.c: Removed not used variable sql/sql_show.cc: Fixed wrong page type storage/maria/ma_blockrec.c: Renamed compact_page() to ma_compact_block_page() and made it global Always zerofill half filled blob pages Set share.state.changed on REDO storage/maria/ma_blockrec.h: Added _ma_compact_block_page() storage/maria/ma_check.c: Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type This gives the following benefits: - Table is smaller if compressed - All LSN are removed for transactional tables and this makes them movable between systems Don't set table states if we are using --quick Changed log entry for repair to use 8 bytes for flag storage/maria/ma_delete.c: Simplify code Update state.changed storage/maria/ma_key_recover.c: Update state.changed storage/maria/ma_locking.c: Set uuid for file on first change if it's not set (table was cleared with zerofill) storage/maria/ma_loghandler.c: Updated log entry for REDO_REPAIR_TABLE storage/maria/ma_recovery.c: Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes) Set new bits in state.changed storage/maria/ma_test_all.sh: Nicer output storage/maria/ma_test_recovery.expected: Updated results (now states flags are visible) storage/maria/ma_update.c: Update state.changed storage/maria/ma_write.c: Simplify code Set state.changed
storage/maria/maria_chk.c: Added option --zerofill Added printing of states for MOVABLE and ZEROFILLED MYD -> MAD MYI -> MAI storage/maria/maria_def.h: Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED Added prototype for new functions storage/maria/unittest/ma_test_all-t: More tests, including tests for zerofill Removed some not needed 'print' statements storage/maria/unittest/ma_test_loghandler_multithread-t.c: Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
param.testflag= uint8korr(rec->header + FILEID_STORE_SIZE);
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updated some result files to new table checksum results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tmpdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log.
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ...
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
param.tmpdir= maria_tmpdir;
Fixed bug that 'maria_read_log -a' didn't set max_trid when repairing tables. Fixed bug in Aria when replacing short keys with long keys and a key tree both overflow and underflow at same time. Fixed several bugs when generating recovery logs when using RQG with replacing long keys with short keys and vice versa. Lots of new DBUG_ASSERT()'s Added more information to recovery log to make it easier to know from where log entry originated. Introduced MARIA_PAGE->org_size that tells what the size of the page was in last log entry. This allows us to find out if all key changes for index page was logged. Small code cleanups: - Introduced _ma_log_key_changes() to log crc of key page changes - Added share->max_index_block_size as max size of data one can put in key block (block_size - KEYPAGE_CHECKSUM_SIZE) This will later simplify adding a directory to index pages. - Write page number instead of page position to DBUG log mysql-test/lib/v1/mysql-test-run.pl: Use --general-log instead of --log to disable warning when using RQG sql/mysqld.cc: If we have already sent ok to client when we get an error, log this to stderr Don't disable option --log-output if CSV engine is not supported. storage/maria/ha_maria.cc: Log queries to recovery log also in LOCK TABLES storage/maria/ma_check.c: If param->max_trid is set, use this value instead of max_trid_in_system(). This is used by recovery to set max_trid to max seen trid so far. keyinfo->block_length - KEYPAGE_CHECKSUM_SIZE -> max_index_block_size (Style optimization) storage/maria/ma_delete.c: Mark tables crashed early Write page number instead of page position to debug log. Added parameter to ma_log_delete() and ma_log_prefix() that is logged so that we can find where wrong log entries were generated. Fixed bug where a page was not properly written when same key tree had both an overflow and underflow when deleting a key.
keyinfo->block_length - KEYPAGE_CHECKSUM_SIZE => max_index_block_size (Style optimization) ma_log_delete() now has extra parameter of how many bytes from end of page should be appended to log for page (for page overflows) storage/maria/ma_key_recover.c: Added extra parameter to ma_log_prefix() to indicate what caused log entry. Update MARIA_PAGE->org_size when logging info about page. Much more DBUG_ASSERT()'s. Fix some bugs in maria_log_add() to handle page overflows. Added _ma_log_key_changes() to log crc of key page changes. If EXTRA_STORE_FULL_PAGE_IN_KEY_CHANGES is defined, log the resulting pages to log so one can trivially see how the resulting page should have looked like (for errors in CRC values) storage/maria/ma_key_recover.h: Added _ma_log_key_changes() which is only called if EXTRA_DEBUG_KEY_CHANGES is defined. Updated function prototypes. storage/maria/ma_loghandler.h: Added more values to en_key_debug, to get more exact location where things went wrong when logging to recovery log. storage/maria/ma_open.c: Initialize share->max_index_block_size storage/maria/ma_page.c: Added updating and testing of MARIA_PAGE->org_size Write page number instead of page position to DBUG log Generate error if we read page with wrong data. Removed wrong assert: key_del_current != share->state.key_del. Simplify _ma_log_compact_keypage() storage/maria/ma_recovery.c: Set param.max_trid to max seen trid before running repair table (used for alter table to create index) storage/maria/ma_rt_key.c: Update call to _ma_log_delete() storage/maria/ma_rt_split.c: Use _ma_log_key_changes() Update MARIA_PAGE->org_size storage/maria/ma_unique.c: Remove casts storage/maria/ma_write.c: keyinfo->block_length - KEYPAGE_CHECKSUM_SIZE => share->max_index_block_length.
Updated calls to _ma_log_prefix() Changed code to use _ma_log_key_changes() Update ma_page->org_size Fixed bug in _ma_log_split() for pages that overflow Added KEY_OP_DEBUG logging to functions Log KEYPAGE_FLAG in all log entries storage/maria/maria_def.h: Added SHARE->max_index_block_size Added MARIA_PAGE->org_size storage/maria/trnman.c: Reset flags for new transaction.
2010-09-06 01:25:44 +02:00
param.max_trid= max_long_trid;
2007-12-04 22:23:42 +01:00
DBUG_ASSERT(maria_tmpdir);
2007-12-31 10:55:46 +01:00
info->s->state.key_map= uint8korr(rec->header + FILEID_STORE_SIZE + 8);
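The two header reads above (`param.testflag` and `state.key_map`) follow the commit notes "flag is now 8 bytes" and "Added used key map to REDO_REPAIR_TABLE". A minimal sketch of that layout, assuming `uint8korr` decodes an unsigned 64-bit value stored low byte first; the names `sketch_read_le64`, `sketch_repair_args`, `sketch_parse_repair_header` and the value of `SKETCH_FILEID_STORE_SIZE` are hypothetical stand-ins for illustration, not the Maria definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for FILEID_STORE_SIZE; the real value is defined
   in the Maria headers and is only assumed here. */
#define SKETCH_FILEID_STORE_SIZE 2

/* Decode an unsigned 64-bit integer stored low byte first (little-endian).
   This mirrors what uint8korr is assumed to do. */
static uint64_t sketch_read_le64(const unsigned char *p)
{
  uint64_t value = 0;
  int i;
  assert(p != NULL);
  for (i = 7; i >= 0; i--)
    value = (value << 8) | p[i];
  return value;
}

/* The two fields the recovery code pulls out of the record header. */
typedef struct
{
  uint64_t testflag; /* repair flags, 8 bytes after the file id */
  uint64_t key_map;  /* used key map, the next 8 bytes */
} sketch_repair_args;

/* Parse a header laid out as: file id | 8-byte flags | 8-byte key map. */
static sketch_repair_args sketch_parse_repair_header(const unsigned char *header)
{
  sketch_repair_args args;
  args.testflag = sketch_read_le64(header + SKETCH_FILEID_STORE_SIZE);
  args.key_map = sketch_read_le64(header + SKETCH_FILEID_STORE_SIZE + 8);
  return args;
}
```

Decoding byte by byte keeps the result independent of the host's endianness, which is presumably why the log format uses explicit store/read macros rather than a raw `memcpy`.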
Added --with-maria-tmp-tables (default one) to allow one to configure if Maria should be used for internal temporary tables Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables Fixed bug that caused update of big blobs to crash Use pagecache_page_no_t as type for pages (to get rid of compiler warnings) Added cast to get rid of compiler warning Fixed wrong types of variables and arguments that caused lost information Fixed wrong DBUG_ASSERT() that caused REDO of big blobs to fail Removed some historical ifdefs that caused problems with Windows compilations BUILD/SETUP.sh: Added --with-maria-tmp-tables include/maria.h: Use pagecache_page_no_t as type for pages Use my_bool as parameter for 'rep_quick' option include/my_base.h: Added comment mysql-test/r/maria-big.result: Added test that uses big blobs mysql-test/t/maria-big.test: Added test that uses big blobs sql/mysqld.cc: Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables sql/sql_class.h: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined sql/sql_select.cc: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined storage/maria/ha_maria.cc: Fixed compiler warnings reported by MCC - Fixed usage of wrong types that caused data loss - Changed parameter for rep_quick to my_bool - Added safe casts Fixed indentation storage/maria/ma_bitmap.c: Use pagecache_page_no_t as type for pages Fixed compiler warnings Fixed bug that caused update of big blobs to crash storage/maria/ma_blockrec.c: Use pagecache_page_no_t as type for pages Use my_bool as parameter for 'rep_quick' option Fixed compiler warnings Fixed wrong DBUG_ASSERT() storage/maria/ma_blockrec.h: Use pagecache_page_no_t as type for pages storage/maria/ma_check.c: Fixed some wrong parameters where we didn't get all bits for test_flag Changed rep_quick to be of type my_bool Use pagecache_page_no_t as type for pages Added casts to get rid of compiler
warnings Changed type of record_pos to get rid of compiler warning storage/maria/ma_create.c: Added safe casts to get rid of compiler warnings storage/maria/ma_dynrec.c: Fixed usage of wrong type storage/maria/ma_key.c: Fixed compiler warning storage/maria/ma_key_recover.c: Use pagecache_page_no_t as type for pages storage/maria/ma_loghandler_lsn.h: Added casts to get rid of compiler warnings storage/maria/ma_page.c: Changed variable name from 'page' to 'pos' as it was an offset and not a page address Moved page_size inside block to get rid of compiler warning storage/maria/ma_pagecache.c: Fixed compiler warnings Replaced compile time assert with TODO storage/maria/ma_pagecache.h: Use pagecache_page_no_t as type for pages storage/maria/ma_pagecrc.c: Allow bitmap pages that are all zero storage/maria/ma_preload.c: Added cast to get rid of compiler warning storage/maria/ma_recovery.c: Changed types to get rid of compiler warnings Use bool for quick_repair to get rid of compiler warning Fixed some variables that were wrongly declared (not enough precision) Added cast to get rid of compiler warning storage/maria/ma_test2.c: Remove historical undefs storage/maria/maria_chk.c: Changed rep_quick to bool Fixed wrong parameter to maria_chk_data_link() storage/maria/maria_def.h: Use pagecache_page_no_t as type for pages storage/maria/maria_pack.c: Renamed isam -> maria storage/maria/plug.in: Added option --with-maria-tmp-tables storage/maria/trnman.c: Added cast to get rid of compiler warning storage/myisam/mi_test2.c: Remove historical undefs
2008-01-10 20:21:36 +01:00
quick_repair= test(param.testflag & T_QUICK);
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updated some result files to new table checksum results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log!
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tmpdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log.
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ...
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
  if (param.testflag & T_REP_PARALLEL)
  {
    if (maria_repair_parallel(&param, info, name, quick_repair))
      goto end;
  }
  else if (param.testflag & T_REP_BY_SORT)
  {
    if (maria_repair_by_sort(&param, info, name, quick_repair))
      goto end;
  }
  else if (maria_repair(&param, info, name, quick_repair))
    goto end;
Added versioning of Maria index
Store max_trid in index file as state.create_trid. This is used to pack all transids in the index pages relative to max possible transid for file.
Enable versioning for transactional tables with index. Tables with an auto-increment key, rtree or fulltext keys are not versioned.
Changed info->lastkey to type MARIA_KEY. Removed info->lastkey_length as this is now part of info->lastkey
Renamed old info->lastkey to info->lastkey_buff
Use exact key lengths for keys, not USE_WHOLE_KEY
For partial key searches, use SEARCH_PART_KEY
When searching to insert new key on page, use SEARCH_INSERT to mark that key has rowid
Changes done in a lot of files:
- Modified functions to use MARIA_KEY instead of key pointer and key length
- Use keyinfo->root_lock instead of share->key_root_lock[keynr]
- Simplify code by using local variable keyinfo instead of share->keyinfo[i]
- Added #ifdef EXTERNAL_LOCKING around removed state elements
- HA_MAX_KEY_BUFF -> MARIA_MAX_KEY_BUFF (to reserve space for transid)
- Changed type of 'nextflag' to uint32 to ensure all SEARCH_xxx flags fit into it

.bzrignore:
  Added missing temporary directory
extra/Makefile.am:
  comp_err is now deleted on make distclean
include/maria.h:
  Added structure MARIA_KEY, which is used for internal key objects in Maria.
  Changed functions to take MARIA_KEY as an argument instead of pointer to packed key.
  Changed some functions that always return true or false to my_bool.
  Added virtual function make_key() to avoid if in _ma_make_key()
  Moved rw_lock_t for locking trees from share->key_root_lock to MARIA_KEYDEF. This makes usage of the locks simpler and faster
include/my_base.h:
  Added HA_RTREE_INDEX flag to mark rtree index. Used for easier checks in ma_check()
  Added SEARCH_INSERT to be used when inserting new keys
  Added SEARCH_PART_KEY for partial searches
  Added SEARCH_USER_KEY_HAS_TRANSID to be used when key we use for searching in btree has a TRANSID
  Added SEARCH_PAGE_KEY_HAS_TRANSID to be used when key we found in btree has a transid
include/my_handler.h:
  Make next_flag 32 bit to make sure we can handle all SEARCH_ bits
mysql-test/include/maria_empty_logs.inc:
  Read and restore current database; Don't assume we are using mysqltest.
  Don't log use databasename to log. Using this include should not cause any result changes.
mysql-test/r/maria-gis-rtree-dynamic.result:
  Updated results after adding some check table commands to help pinpoint errors
mysql-test/r/maria-mvcc.result:
  New tests
mysql-test/r/maria-purge.result:
  New result after adding removal of logs
mysql-test/r/maria-recovery-big.result:
  maria_empty_logs doesn't log 'use mysqltest' anymore
mysql-test/r/maria-recovery-bitmap.result:
  maria_empty_logs doesn't log 'use mysqltest' anymore
mysql-test/r/maria-recovery-rtree-ft.result:
  maria_empty_logs doesn't log 'use mysqltest' anymore
mysql-test/r/maria-recovery.result:
  maria_empty_logs doesn't log 'use mysqltest' anymore
mysql-test/r/maria.result:
  New tests
mysql-test/r/variables-big.result:
  Don't log id as it's not predictable
mysql-test/suite/rpl_ndb/r/rpl_truncate_7ndb_2.result:
  Updated results to new binlog results. (Test has not been run in a long time as it requires --big)
mysql-test/suite/rpl_ndb/t/rpl_truncate_7ndb_2-master.opt:
  Moved file to ndb replication test directory
mysql-test/suite/rpl_ndb/t/rpl_truncate_7ndb_2.test:
  Fixed wrong path to included tests
mysql-test/t/maria-gis-rtree-dynamic.test:
  Added some check table commands to help pinpoint errors
mysql-test/t/maria-mvcc.test:
  New tests
mysql-test/t/maria-purge.test:
  Remove logs to make test results predictable
mysql-test/t/maria.test:
  New tests for some possible problems
mysql-test/t/variables-big.test:
  Don't log id as it's not predictable
mysys/my_handler.c:
  Updated function comment to reflect old code
  Changed nextflag to be uint32 to ensure we can have flags > 16 bit
  Changed checking if we are in insert with NULL keys as next_flag can now include additional bits that have to be ignored.
  Added SEARCH_INSERT flag to be used when inserting new keys in btree. This flag tells us that the keys include row position and it's thus safe to remove SEARCH_FIND
  Added comparison of transid. This is only done if the keys actually have a transid, which is indicated by nextflag
mysys/my_lock.c:
  Fixed wrong test (Found by Guilhem)
scripts/Makefile.am:
  Ensure that test programs are deleted by make clean
sql/rpl_rli.cc:
  Moved assignment order to fix compiler warning
storage/heap/hp_write.c:
  Add SEARCH_INSERT to signal ha_key_cmp that we should also compare rowid for keys
storage/maria/Makefile.am:
  Remove also maria log files when doing make distclean
storage/maria/ha_maria.cc:
  Use 'file->start_state' as default state for transactional tables without versioning
  At table unlock, set file->state to point to live state. (Needed for information schema to pick up right number of rows)
  In ha_maria::implicit_commit() move all locked (ie open) tables to new transaction. This is needed to ensure ha_maria->info doesn't point to a deleted history event.
  Disable concurrent inserts for insert ... select and table changes with subqueries if statement based replication as this would cause wrong results on slave
storage/maria/ma_blockrec.c:
  Updated comment
storage/maria/ma_check.c:
  Compact key pages (removes transid) when doing --zerofill
  Check that 'page_flag' on key pages contains KEYPAGE_FLAG_HAS_TRANSID if there is a single key on the page with a transid
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Use new interface to _ma_rec_pos(), _ma_dpointer(), _ma_ft_del(), ma_update_state_lsn()
  Removed not needed argument from get_record_for_key()
  Fixed that check doesn't give errors for RTREE; we now treat these like SPATIAL
  Remove some SPATIAL specific code where the virtual functions can handle this in a general manner
  Use info->lastkey_buff instead of info->lastkey
  _ma_dpos() -> _ma_row_pos_from_key()
  _ma_make_key() -> keyinfo->make_key()
  _ma_print_key() -> _ma_print_keydata()
  _ma_move_key() -> ma_copy_copy()
  Add SEARCH_INSERT to signal ha_key_cmp that we should also compare rowid for keys
  Ensure that data on page doesn't overwrite page checksum position
  Use DBUG_DUMP_KEY instead of DBUG_DUMP
  Use exact key lengths instead of USE_WHOLE_KEY to ha_key_cmp()
  Fixed check if rowid points outside of BLOCK_RECORD data file
  Use info->lastkey_buff instead of key on stack in some safe places
  Added #ifdef EXTERNAL_LOCKING around removed state elements
storage/maria/ma_close.c:
  Use keyinfo->root_lock instead of share->key_root_lock[keynr]
storage/maria/ma_create.c:
  Removed assert that is already checked in maria_init()
  Force transactional tables to be of type BLOCK_RECORD
  Fixed wrong usage of HA_PACK_RECORD (should be HA_OPTION_PACK_RECORD)
  Mark keys that use HA_KEY_ALG_RTREE with HA_RTREE_INDEX for easier handling of these in ma_check
  Store max_trid in index file as state.create_trid. This is used to pack all transids in the index pages relative to max possible transid for file.
storage/maria/ma_dbug.c:
  Changed _ma_print_key() to use MARIA_KEY
storage/maria/ma_delete.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  info->lastkey2 -> info->lastkey_buff2
  Added SEARCH_INSERT to signal ha_key_cmp that we should also compare rowid for keys
  Use new interface for get_key(), _ma_get_last_key() and others
  _ma_dpos() -> ma_row_pos_from_key()
  Simplify setting of prev_key in del()
  Ensure that KEYPAGE_FLAG_HAS_TRANSID is set in page_flag if key page has transid
  Treat key pages that may have a transid as if keys would be of variable length
storage/maria/ma_delete_all.c:
  Reset history state if maria_delete_all_rows() is called
  Update parameters to _ma_update_state_lsns() call
storage/maria/ma_extra.c:
  Store and restore info->lastkey
storage/maria/ma_ft_boolean_search.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_ft_nlq_search.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Use lastkey_buff2 instead of info->lastkey+info->s->base.max_key_length (same thing)
storage/maria/ma_ft_update.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_ftdefs.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_fulltext.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_init.c:
  Check if blocksize is legal (Moved test here from ma_open())
storage/maria/ma_key.c:
  Added functions for storing/reading of transid
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Moved _ma_sp_make_key() out of _ma_make_key() as we now use keyinfo->make_key to create keys
  Add transid to keys if table is versioned
  Added _ma_copy_key()
storage/maria/ma_key_recover.c:
  Add logging of page_flag (holds information if there are keys with transid on page)
  Changed DBUG_PRINT("info" -> DBUG_PRINT("redo" as the redo logging can be quite extensive
  Added lots of DBUG_PRINT()
  Added support for index page operations: KEY_OP_SET_PAGEFLAG and KEY_OP_COMPACT_PAGE
storage/maria/ma_key_recover.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_locking.c:
  Added new arguments to _ma_update_state_lsns_sub()
storage/maria/ma_loghandler.c:
  Fixed all logging of LSN to look similar in DBUG log
  Changed if (left != 0) to if (left) as the latter is used also later in the code
storage/maria/ma_loghandler.h:
  Added new index page operations
storage/maria/ma_open.c:
  Removed allocated "state_dummy" and instead use share->state.common for transactional tables that are not versioned. This is needed to not get double increments of state.records (one in ma_write.c and one when log is written)
  Changed info->lastkey to MARIA_KEY type
  Removed resetting of MARIA_HA variables that have 0 as default value (as info is zerofilled)
  Enable versioning for transactional tables with index. Tables with an auto-increment key, rtree or fulltext keys are not versioned.
  Check on open that state.create_trid is correct
  Extend share->base.max_key_length in case of transactional table so that it can hold transid
  Removed 4.0 compatible fulltext key mode as this is not relevant for Maria
  Removed old and wrong #ifdef ENABLE_WHEN_WE_HAVE_TRANS_ROW_ID code block
  Initialize all new virtual function pointers
  Removed storing of state->unique, state->process and store state->create_trid instead
storage/maria/ma_page.c:
  Added comment to describe key page structure
  Added functions to compact key page and log the compact operation
storage/maria/ma_range.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Use SEARCH_PART_KEY indicator instead of USE_WHOLE_KEY to detect if we are doing a part key search
  Added handling of pages with transid
storage/maria/ma_recovery.c:
  Don't assert if the table we opened is not transactional. This may be a table which has been changed from transactional to not transactional
  Added new arguments to _ma_update_state_lsns()
storage/maria/ma_rename.c:
  Added new arguments to _ma_update_state_lsns()
storage/maria/ma_rkey.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Don't use USE_WHOLE_KEY, use real length of key
  Use share->row_is_visible() to test if row is visible
  Moved search_flag == HA_READ_KEY_EXACT out of 'read-next-row' loop as this only needs to be tested once
  Removed test if last_used_keyseg != 0 as this is always true
storage/maria/ma_rnext.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Simplify code by using local variable keyinfo instead of share->keyinfo[i]
  Use share->row_is_visible() to test if row is visible
storage/maria/ma_rnext_same.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  lastkey2 -> lastkey_buff2
storage/maria/ma_rprev.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Simplify code by using local variable keyinfo instead of share->keyinfo[i]
  Use share->row_is_visible() to test if row is visible
storage/maria/ma_rsame.c:
  Updated comment
  Simplify code by using local variable keyinfo instead of share->keyinfo[i]
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_rsamepos.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_rt_index.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Use better variable names
  Removed not needed casts
  _ma_dpos() -> _ma_row_pos_from_key()
  Use info->last_rtree_keypos to save position to key instead of info->int_keypos
  Simplify err: condition
  Changed return type for maria_rtree_insert() to my_bool as we are only interested in ok/fail from this function
storage/maria/ma_rt_index.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_rt_key.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Simplify maria_rtree_add_key by combining identical code and removing added_len
storage/maria/ma_rt_key.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_rt_mbr.c:
  Changed type of 'nextflag' to uint32
  Added 'to' argument to RT_PAGE_MBR_XXX functions to more clearly see which variables change value
storage/maria/ma_rt_mbr.h:
  Changed type of 'nextflag' to uint32
storage/maria/ma_rt_split.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  key_length -> key_data_length to catch possible errors
storage/maria/ma_rt_test.c:
  Fixed wrong comment
  Reset recinfo to avoid valgrind warnings
  Fixed wrong argument to create_record() that caused test to fail
storage/maria/ma_search.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Added support of keys with optional trid
  Test for SEARCH_PART_KEY instead of USE_WHOLE_KEY to detect part key reads
  _ma_dpos() -> _ma_row_pos_from_key()
  If there may be keys with transid on the page, have _ma_bin_search() call _ma_seq_search()
  Add _ma_skip_xxx() functions to quickly step over keys (faster than calling get_key() in most cases as we don't have to copy key data)
  Combine similar code at end of _ma_get_binary_pack_key()
  Removed not used function _ma_move_key()
  In _ma_search_next() don't call _ma_search() if we aren't on a nod page.
  Update info->cur_row.trid with trid for found key
  Removed some not needed casts
  Added _ma_trid_from_key()
  Use MARIA_SHARE instead of MARIA_HA as arguments to _ma_rec_pos(), _ma_dpointer() and _ma_xxx_keypos_to_recpos() to make functions faster and smaller
storage/maria/ma_sort.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_sp_defs.h:
  _ma_sp_make_key() now fills in and returns (MARIA_KEY *) value
storage/maria/ma_sp_key.c:
  _ma_sp_make_key() now fills in and returns (MARIA_KEY *) value
  Don't test sizeof(double), test against 8 as we are using float8store()
  Use mi_float8store() instead of doing swap of value (same thing but faster)
storage/maria/ma_state.c:
  maria_versioning() now only calls _ma_block_get_status() if table supports versioning
  Added _ma_row_visible_xxx() functions for different occasions
  When emptying history, set info->state to point to the first history event.
storage/maria/ma_state.h:
  Added _ma_row_visible_xxx() prototypes
storage/maria/ma_static.c:
  Indentation changes
storage/maria/ma_statrec.c:
  Fixed arguments to _ma_dpointer() and _ma_rec_pos()
storage/maria/ma_test1.c:
  Call init_thr_lock() if we have versioning
storage/maria/ma_test2.c:
  Call init_thr_lock() if we have versioning
storage/maria/ma_unique.c:
  Modified functions to use MARIA_KEY
storage/maria/ma_update.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
storage/maria/ma_write.c:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Simplify code by using local variable keyinfo instead of share->keyinfo[i]
  In _ma_enlarge_root(), mark in page_flag if new key has transid
  _ma_dpos() -> _ma_row_pos_from_key()
  Changed return type of _ma_ck_write_tree() to my_bool as we are only testing if result is true or not
  Moved 'reversed' to outside block as area was used later
storage/maria/maria_chk.c:
  Added error if trying to sort with HA_BINARY_PACK_KEY
  Use new interface to get_key() and _ma_dpointer()
  _ma_dpos() -> _ma_row_pos_from_key()
storage/maria/maria_def.h:
  Modified functions to use MARIA_KEY instead of key pointer and key length
  Added 'common' to MARIA_SHARE->state for storing state for transactional tables without versioning
  Added create_trid to MARIA_SHARE
  Removed not used state variables 'process' and 'unique'
  Added defines for handling TRID's in index pages
  Changed to use MARIA_SHARE instead of MARIA_HA for some functions
  Added 'have_versioning' flag if table supports versioning
  Moved key_root_lock from MARIA_SHARE to MARIA_KEYDEF
  Changed last_key to be of type MARIA_KEY. Removed lastkey_length
  lastkey -> lastkey_buff, lastkey2 -> lastkey_buff2
  Added _ma_get_used_and_nod_with_flag() for faster access to page data when page_flag is read
  Added DBUG_DUMP_KEY for easier DBUG_DUMP of a key
  Changed 'nextflag' and associated variables to uint32
storage/maria/maria_ftdump.c:
  lastkey -> lastkey_buff
storage/maria/trnman.c:
  Fixed wrong initialization of min_read_from and max_commit_trid
  Added trnman_get_min_safe_trid()
storage/maria/unittest/ma_test_all-t:
  Added --start-from
storage/myisam/mi_check.c:
  Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparison for inserting key on page in rowid order
storage/myisam/mi_delete.c:
  Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparison for inserting key on page in rowid order
storage/myisam/mi_range.c:
  Updated comment
storage/myisam/mi_write.c:
  Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparison for inserting key on page in rowid order
storage/myisam/rt_index.c:
  Fixed wrong parameter to rtree_get_req() which could cause crash
2008-06-26 07:18:28 +02:00
  if (_ma_update_state_lsns(info->s, rec->lsn, trnman_get_min_safe_trid(),
                            TRUE, !(param.testflag & T_NO_CREATE_RENAME_LSN)))
    goto end;
  error= 0;
end:
  DBUG_RETURN(error);
}


prototype_redo_exec_hook(REDO_DROP_TABLE)
{
  char *name;
  int error= 1;
  MARIA_HA *info;
  /* Recovery from ha_maria skips DDLs; maria_read_log replays them */
  if (skip_DDLs)
  {
    tprint(tracef, "we skip DDLs\n");
    return 0;
  }
  /* Read the whole record body into log_record_buffer */
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
    eprint(tracef, "Failed to read record");
    return 1;
  }
  name= (char *)log_record_buffer.str;
  tprint(tracef, "Table '%s'", name);
  info= maria_open(name, O_RDONLY, HA_OPEN_FOR_REPAIR);
  if (info)
  {
    MARIA_SHARE *share= info->s;
    if (!share->base.born_transactional)
    {
      tprint(tracef, ", is not transactional, ignoring removal\n");
      ALERT_USER();
      error= 0;
      goto end;
    }
    /* The table was created or renamed at or after this record: stale drop */
    if (cmp_translog_addr(share->state.create_rename_lsn, rec->lsn) >= 0)
    {
      tprint(tracef, ", has create_rename_lsn (%lu,0x%lx) more recent than"
             " record, ignoring removal",
             LSN_IN_PARTS(share->state.create_rename_lsn));
      error= 0;
      goto end;
    }
    /* Play safe: do not replay a DDL on a table marked as crashed */
    if (maria_is_crashed(info))
    {
      tprint(tracef, ", is crashed, can't drop it");
      ALERT_USER();
goto end;
}
Changed all file names in maria to LEX_STRING and removed some calls to strlen()
Ensure that pagecache gives correct error number even if the error for a block happened in another thread
mysys/my_pread.c:
  Indentation fix
storage/maria/ha_maria.cc:
  filenames changed to be of type LEX_STRING
storage/maria/ma_check.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_checkpoint.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_create.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_dbug.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_delete.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_info.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_keycache.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_locking.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_loghandler.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_open.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_pagecache.c:
  Store error number for last failed operation in the page block
  This should fix some asserts() when errno was not properly set after failure to read block in another thread
storage/maria/ma_recovery.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_update.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_write.c:
  filenames changed to be of type LEX_STRING
storage/maria/maria_def.h:
  filenames changed to be of type LEX_STRING
storage/maria/maria_ftdump.c:
  filenames changed to be of type LEX_STRING
storage/maria/maria_pack.c:
  filenames changed to be of type LEX_STRING
2008-08-25 13:49:47 +02:00
if (close_one_table(info->s->open_file_name.str, rec->lsn) ||
maria_close(info))
goto end;
info= NULL;
/* if it is older, or its header is corrupted, drop it */
tprint(tracef, ", dropping '%s'", name);
if (maria_delete_table(name))
{
Bugs fixed:
- If not in autocommit mode, delete rows one by one so that we can roll back if necessary
- bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
- Fixed bug in bitmap handling when allocating tail pages
- Ensure we reserve place for directory entry when calculating place for head and tail pages
- Fixed wrong value in bitmap->size[0]
- Fixed wrong assert in flush_log_for_bitmap
- Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
- Mark new pages as changed (required to get repair() to work)
- Fixed problem with advancing log horizon pointer within one page bounds
- Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
- Fixed bug in logging of rows with more than one big blob
- Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls
- Flush pagecache when we change from logging to not logging (if not, pagecache code breaks)
- Ensure my_errno is set on return from write/delete/update
- Fixed bug when using FIELD_SKIP_PRESPACE
New features:
- mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution
- Fix that optimize works for Maria tables
- maria_check --zerofill now also clears freed blob pages
- maria_check -di now prints more information about record page utilization
Optimizations:
- Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
- Simplify code to abort when we found optimal bit pattern
- Skip also full head page bit patterns when searching for tail
- Increase default repair buffer to 128M for maria_chk and maria_read_log
- Increase default sort buffer for maria_chk to 64M
- Increase size of sortbuffer and pagecache for mysqld to 64M
- VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables
Better reporting:
- Fixed test of error condition for flush (for better error code)
- More error messages to mysqld if Maria recovery fails
- Always print warning if rows are deleted in repair
- Added global function _db_force_flush() that is usable when doing debugging in gdb
- Added call to my_debug_put_break_here() in case of some errors (for debugging)
- Remove used testfiles in unittest as these were written in different directories depending on where the test was started from
This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria
dbug/dbug.c:
  Added global function _db_force_flush() that is usable when doing debugging in gdb
extra/replace.c:
  Fixed memory leak
include/my_dbug.h:
  Prototype for _db_force_flush()
include/my_global.h:
  Added stdarg.h as my_sys.h now depends on it.
include/my_sys.h:
  Make my_dbug_put_break_here() a NOP if not DBUG build
  Added my_printv_error()
include/myisamchk.h:
  Added entry 'lost' to be able to count space that is lost forever
mysql-test/r/maria.result:
  Updated results
mysql-test/t/maria.test:
  Reset autocommit after test
  New test to check if delete_all_rows is used (verified with --debug)
mysys/my_error.c:
  Added my_printv_error()
scripts/mysql_fix_privilege_tables.sh:
  First use binaries and scripts from source distribution, then in installed distribution
  (This ensures that a development branch doesn't pick up wrong scripts)
sql/mysqld.cc:
  Fix that one can break maria recovery with ^C when debugging
sql/sql_class.cc:
  Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed)
storage/maria/ha_maria.cc:
  Increase size of sortbuffer and pagecache to 64M
  Fix that optimize works for Maria tables
  Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
  If not in autocommit mode, delete rows one by one so that we can roll back if necessary
  Fixed variable comments
storage/maria/ma_bitmap.c:
  More ASSERTS to detect overwrite of bitmap pages
  bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
  Ensure we reserve place for directory entry when calculating place for head and tail pages
  bitmap->size[0] should not include space for directory entry
  Simplify code to abort when we found optimal bit pattern
  Skip also full head page bit patterns when searching for tail (should speed up some common cases)
  Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes
  Fixed wrong assert in flush_log_for_bitmap
  Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
storage/maria/ma_blockrec.c:
  Ensure my_errno is set on return
  Fixed not optimal setting of row->min_length if we don't have variable length fields
  Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
  Added DBUG_ASSERT() if we read or write wrong VARCHAR data
  Added DBUG_ASSERT() to find out if row sizes are calculated wrong
  Fixed bug in logging of rows with more than one big blob
storage/maria/ma_check.c:
  Disable logging while normal repair is done to avoid logging of index changes
  Fixed bug that caused CHECKSUM part of key page to be used
  Fixed that deletion of wrong records also works for BLOCK_RECORD
  Clear unallocated pages:
  - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not
  Better error reporting
  More information about record page utilization
  Change printing of file position to printing of pages to make output more readable
  Always print warning if rows are deleted
storage/maria/ma_create.c:
  Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future)
  Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order
  Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization)
  Store other fields in length order (to get better utilization of head block)
storage/maria/ma_delete.c:
  Ensure my_errno is set on return
storage/maria/ma_dynrec.c:
  Indentation fix
storage/maria/ma_locking.c:
  Set changed if open_count is counted down. (To avoid getting error "client is using or hasn't closed the table properly" with transactional tables)
storage/maria/ma_loghandler.c:
  Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja)
  Added more DBUG
  Indentation fixes
storage/maria/ma_open.c:
  Removed wrong casts
storage/maria/ma_page.c:
  Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new()
  Mark new pages as changed (required to get repair() to work)
storage/maria/ma_pagecache.c:
  Fixed test of error condition for flush
  Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock()
  Added call to my_debug_put_break_here() in case of errors (for debugging)
storage/maria/ma_pagecrc.c:
  Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32
storage/maria/ma_recovery.c:
  Call my_printv_error() from eprint() to get critical errors to mysqld log
  Removed \n from error strings to eprint() to get nicer output in mysqld
  Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed
storage/maria/ma_update.c:
  Ensure my_errno is set on return
storage/maria/ma_write.c:
  Ensure my_errno is set on return
storage/maria/maria_chk.c:
  Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries)
  Added option --skip-safemalloc
  Don't write exponents for rec/key
storage/maria/maria_def.h:
  Increase default repair buffer to 128M for maria_chk and maria_read_log
  Increase default sort buffer for maria_chk to 64M
storage/maria/unittest/Makefile.am:
  Don't update files automatically from bitkeeper
storage/maria/unittest/ma_pagecache_consist.c:
  Remove testfile at end
storage/maria/unittest/ma_pagecache_single.c:
  Remove testfile at end
storage/maria/unittest/ma_test_all-t:
  More tests
  Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to drop table");
goto end;
}
}
  else /* one or two files absent, or header corrupted... */
    tprint(tracef,", can't be opened, probably does not exist");
  error= 0;
end:
  tprint(tracef, "\n");
  if (info != NULL)
    error|= maria_close(info);
  return error;
}


prototype_redo_exec_hook(FILE_ID)
{
  uint16 sid;
  int error= 1;
  const char *name;
  MARIA_HA *info;
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
  DBUG_ENTER("exec_REDO_LOGREC_FILE_ID");
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
  if (cmp_translog_addr(rec->lsn, checkpoint_start) < 0)
  {
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
    /*
      If that mapping was still true at checkpoint time, it was found in
      the checkpoint record, so there is no need to recreate it. If that
      mapping had ended at checkpoint time (table was closed or repaired),
      a flush and force happened, so the mapping is not needed.
    */
    tprint(tracef, "ignoring because before checkpoint\n");
    DBUG_RETURN(0);
  }
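The skip test above can be illustrated in isolation. The following is a minimal sketch, not the engine's code: `sketch_lsn`, `make_lsn()`, `sketch_cmp_addr()` and `file_id_record_is_skippable()` are hypothetical names, and the layout (file number in the high 32 bits, byte offset in the low 32 bits) is an assumption chosen only so that plain integer comparison matches log order. It shows that a FILE_ID record located strictly before the checkpoint start is ignored, while one at or after it is processed.

```c
#include <stdint.h>

/*
  Hypothetical stand-in for a log address (assumption, not the real
  LSN type): file number in the high 32 bits, offset in the low 32 bits.
*/
typedef uint64_t sketch_lsn;

static sketch_lsn make_lsn(uint32_t file_no, uint32_t offset)
{
  return ((sketch_lsn) file_no << 32) | offset;
}

/* Three-way comparison: <0 if a is before b, 0 if equal, >0 if after. */
static int sketch_cmp_addr(sketch_lsn a, sketch_lsn b)
{
  return (a > b) - (a < b);
}

/*
  Returns 1 if a FILE_ID record at rec_lsn can be ignored because it was
  written before the checkpoint started: its id->name mapping is either
  already in the checkpoint record or no longer needed.
*/
static int file_id_record_is_skippable(sketch_lsn rec_lsn,
                                       sketch_lsn checkpoint_start)
{
  return sketch_cmp_addr(rec_lsn, checkpoint_start) < 0;
}
```

With a checkpoint starting at file 1, offset 1000, a record at file 1, offset 500 would be skipped, while records at file 1, offset 1000 or at file 2, offset 10 would be applied.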
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Bugs fixed:
- If not in autocommit mode, delete rows one by one so that we can roll back if necessary
- bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
- Fixed bug in bitmap handling when allocating tail pages
- Ensure we reserve space for the directory entry when calculating space for head and tail pages
- Fixed wrong value in bitmap->size[0]
- Fixed wrong assert in flush_log_for_bitmap
- Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
- Mark new pages as changed (Required to get repair() to work)
- Fixed problem with advancing log horizon pointer within one page bounds
- Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
- Fixed bug in logging of rows with more than one big blob
- Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls
- Flush pagecache when we change from logging to not logging (if not, pagecache code breaks)
- Ensure my_errno is set on return from write/delete/update
- Fixed bug when using FIELD_SKIP_PRESPACE

New features:
- mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution
- Fix that optimize works for Maria tables
- maria_check --zerofill now also clears freed blob pages
- maria_check -di now prints more information about record page utilization

Optimizations:
- Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
- Simplify code to abort when we found optimal bit pattern
- Skip also full head page bit patterns when searching for tail
- Increase default repair buffer to 128M for maria_chk and maria_read_log
- Increase default sort buffer for maria_chk to 64M
- Increase size of sortbuffer and pagecache for mysqld to 64M
- VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables

Better reporting:
- Fixed test of error condition for flush (for better error code)
- More error messages to mysqld if Maria recovery fails
- Always print warning if rows are deleted in repair
- Added global function _db_force_flush() that is usable when doing debugging in gdb
- Added call to my_debug_put_break_here() in case of some errors (for debugging)
- Remove used testfiles in unittest as these were written in different directories depending on from where the test was started

This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria

dbug/dbug.c:
  Added global function _db_force_flush() that is usable when doing debugging in gdb
extra/replace.c:
  Fixed memory leak
include/my_dbug.h:
  Prototype for _db_force_flush()
include/my_global.h:
  Added stdarg.h as my_sys.h now depends on it.
include/my_sys.h:
  Make my_dbug_put_break_here() a NOP if not DBUG build
  Added my_printv_error()
include/myisamchk.h:
  Added entry 'lost' to be able to count space that is lost forever
mysql-test/r/maria.result:
  Updated results
mysql-test/t/maria.test:
  Reset autocommit after test
  New test to check if delete_all_rows is used (verified with --debug)
mysys/my_error.c:
  Added my_printv_error()
scripts/mysql_fix_privilege_tables.sh:
  First use binaries and scripts from source distribution, then in installed distribution
  (This ensures that a development branch doesn't pick up wrong scripts)
sql/mysqld.cc:
  Fix that one can break maria recovery with ^C when debugging
sql/sql_class.cc:
  Removed #ifdef that has no effect
  (The preceding DBUG_ASSERT() ensures that the following code will not be executed)
storage/maria/ha_maria.cc:
  Increase size of sortbuffer and pagecache to 64M
  Fix that optimize works for Maria tables
  Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
  If not in autocommit mode, delete rows one by one so that we can roll back if necessary
  Fixed variable comments
storage/maria/ma_bitmap.c:
  More ASSERTS to detect overwrite of bitmap pages
  bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
  Ensure we reserve space for the directory entry when calculating space for head and tail pages
  bitmap->size[0] should not include space for directory entry
  Simplify code to abort when we found optimal bit pattern
  Skip also full head page bit patterns when searching for tail (should speed up some common cases)
  Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes
  Fixed wrong assert in flush_log_for_bitmap
  Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
storage/maria/ma_blockrec.c:
  Ensure my_errno is set on return
  Fixed suboptimal setting of row->min_length if we don't have variable length fields
  Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
  Added DBUG_ASSERT() if we read or write wrong VARCHAR data
  Added DBUG_ASSERT() to find out if row sizes are calculated wrong
  Fixed bug in logging of rows with more than one big blob
storage/maria/ma_check.c:
  Disable logging while normal repair is done to avoid logging of index changes
  Fixed bug that caused CHECKSUM part of key page to be used
  Fixed that deletion of wrong records also works for BLOCK_RECORD
  Clear unallocated pages:
  - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not
  Better error reporting
  More information about record page utilization
  Change printing of file position to printing of pages to make output more readable
  Always print warning if rows are deleted
storage/maria/ma_create.c:
  Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future)
  Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order
  Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization)
  Store other fields in length order (to get better utilization of head block)
storage/maria/ma_delete.c:
  Ensure my_errno is set on return
storage/maria/ma_dynrec.c:
  Indentation fix
storage/maria/ma_locking.c:
  Set changed if open_count is counted down.
  (To avoid getting error "client is using or hasn't closed the table properly" with transactional tables)
storage/maria/ma_loghandler.c:
  Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja)
  Added more DBUG
  Indentation fixes
storage/maria/ma_open.c:
  Removed wrong casts
storage/maria/ma_page.c:
  Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new()
  Mark new pages as changed (Required to get repair() to work)
storage/maria/ma_pagecache.c:
  Fixed test of error condition for flush
  Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock()
  Added call to my_debug_put_break_here() in case of errors (for debugging)
storage/maria/ma_pagecrc.c:
  Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32
storage/maria/ma_recovery.c:
  Call my_printv_error() from eprint() to get critical errors to mysqld log
  Removed \n from error strings to eprint() to get nicer output in mysqld
  Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed
storage/maria/ma_update.c:
  Ensure my_errno is set on return
storage/maria/ma_write.c:
  Ensure my_errno is set on return
storage/maria/maria_chk.c:
  Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries)
  Added option --skip-safemalloc
  Don't write exponents for rec/key
storage/maria/maria_def.h:
  Increase default repair buffer to 128M for maria_chk and maria_read_log
  Increase default sort buffer for maria_chk to 64M
storage/maria/unittest/Makefile.am:
  Don't update files automatically from bitkeeper
storage/maria/unittest/ma_pagecache_consist.c:
  Remove testfile at end
storage/maria/unittest/ma_pagecache_single.c:
  Remove testfile at end
storage/maria/unittest/ma_test_all-t:
  More tests
  Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to read record");
goto end;
}
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)

mysys/my_realloc.c:
  noted an issue in my_realloc()
sql/mysqld.cc:
  page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc:
  update to new prototype
storage/maria/ma_bitmap.c:
  when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c:
  Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c:
  empty data file has a bitmap page
storage/maria/ma_control_file.c:
  useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h:
  new prototype
storage/maria/ma_create.c:
  Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c:
  as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this.
  A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h:
  new function
storage/maria/ma_loghandler_lsn.h:
  maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
  * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
  * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
  * comments.
storage/maria/ma_recovery.c:
  * preparations for the UNDO phase: recreate TRNs
  * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
  * reworking all around (less duplication)
storage/maria/ma_recovery.h:
  a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c:
  tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
  * update to new prototype
  * no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
  * a function for Recovery to create a transaction (TRN), needed in the UNDO phase
  * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h:
  new functions
2007-08-29 16:43:01 +02:00
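The ma_bitmap.c note in the commit message above turns on the order of two file operations: extend the file to just short of a full page, then write the end marker, so that a crash in between leaves a page that is visibly too short instead of a full page with no marker. A minimal sketch of that ordering with plain POSIX calls follows. It assumes an 8192-byte page and a 2-byte "bm" marker; `create_first_bitmap_page`, `demo_create`, `BITMAP_PAGE_SIZE` and `BITMAP_MARKER` are hypothetical names for illustration, not the functions used in Maria.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical constants for illustration; not taken from the Maria source. */
#define BITMAP_PAGE_SIZE 8192
static const unsigned char BITMAP_MARKER[2] = { 'b', 'm' };

/*
  Create the first bitmap page of an empty file in two steps:
  1. extend the file to page size minus marker size (zero-filled),
  2. write the marker at the end, which completes the page.
  A crash between the steps leaves a file shorter than one page.
  Returns 0 on success, -1 on error.
*/
static int create_first_bitmap_page(int fd)
{
  const off_t body_size = BITMAP_PAGE_SIZE - (off_t) sizeof(BITMAP_MARKER);
  assert(fd >= 0);
  if (ftruncate(fd, body_size) != 0)
    return -1;
  if (pwrite(fd, BITMAP_MARKER, sizeof(BITMAP_MARKER), body_size) !=
      (ssize_t) sizeof(BITMAP_MARKER))
    return -1;
  return 0;
}

/*
  Usage example: create the page in a scratch file and return the
  resulting file size, or -1 on error. The scratch file is removed.
*/
static off_t demo_create(const char *path)
{
  off_t size = -1;
  int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
  if (fd < 0)
    return -1;
  if (create_first_bitmap_page(fd) == 0)
    size = lseek(fd, 0, SEEK_END);
  close(fd);
  unlink(path);
  return size;
}
```

A reader of such a file can treat any size below `BITMAP_PAGE_SIZE` as an interrupted creation and rebuild the page, which is the behaviour the commit message describes for recovery.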
sid= fileid_korr(log_record_buffer.str);
info= all_tables[sid].info;
if (info != NULL)
{
Changed all file names in maria to LEX_STRING and removed some calls to strlen()
Ensure that pagecache gives correct error number even if error for block happened

mysys/my_pread.c:
  Indentation fix
storage/maria/ha_maria.cc:
  filenames changed to be of type LEX_STRING
storage/maria/ma_check.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_checkpoint.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_create.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_dbug.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_delete.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_info.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_keycache.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_locking.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_loghandler.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_open.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_pagecache.c:
  Store error number for last failed operation in the page block
  This should fix some asserts() when errno was not properly set after failure to read block in another thread
storage/maria/ma_recovery.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_update.c:
  filenames changed to be of type LEX_STRING
storage/maria/ma_write.c:
  filenames changed to be of type LEX_STRING
storage/maria/maria_def.h:
  filenames changed to be of type LEX_STRING
storage/maria/maria_ftdump.c:
  filenames changed to be of type LEX_STRING
storage/maria/maria_pack.c:
  filenames changed to be of type LEX_STRING
2008-08-25 13:49:47 +02:00
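The commit message above replaces plain C-string file names with LEX_STRING, a pointer-plus-length pair, which is why the code reads `open_file_name.str`. The sketch below shows, under stated assumptions, why a cached length removes repeated strlen() calls. `counted_string`, `make_counted` and `same_name` are hypothetical stand-ins written for this illustration; they are not the server's definitions or helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counted string in the spirit of LEX_STRING: pointer plus cached length. */
typedef struct
{
  char *str;
  size_t length;
} counted_string;

/* Measure the string once, when the name is stored. */
static counted_string make_counted(char *s)
{
  counted_string cs;
  assert(s != NULL);
  cs.str= s;
  cs.length= strlen(s);
  return cs;
}

/*
  Compare two names without calling strlen(): the cached lengths are
  checked first, and the bytes are compared only if the lengths match.
*/
static int same_name(const counted_string *a, const counted_string *b)
{
  return a->length == b->length &&
         memcmp(a->str, b->str, a->length) == 0;
}
```

Callers that only need the text, such as a trace print, keep using the `str` member, while length-sensitive code paths read `length` directly.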
tprint(tracef, " Closing table '%s'\n", info->s->open_file_name.str);
prepare_table_for_close(info, rec->lsn);
if (maria_close(info))
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocating tail pages - Ensure we reserve place for directory entry when calculating place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clears freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these were written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdb extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution (This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculating place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deletion of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables) storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automatically from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to close table");
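The 2008-01-07 commit message above says that eprint() now calls my_printv_error() so that critical errors also reach the mysqld log, and that trailing "\n" was dropped from the format strings. A minimal sketch of that forwarding pattern follows. The names sketch_eprint, sketch_report_v and sketch_last_error are hypothetical, and sketch_report_v only stands in for my_printv_error(), whose real signature is not shown in this listing.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sink: keeps the last formatted message so it can be inspected. */
static char sketch_last_error[256];

/* Stand-in for a va_list-taking reporter such as my_printv_error().
   The real function's signature is an assumption not made here. */
static void sketch_report_v(const char *fmt, va_list args)
{
  vsnprintf(sketch_last_error, sizeof(sketch_last_error), fmt, args);
}

/* Writes the message to the trace file (if any) and forwards it to the
   reporter. Callers pass format strings without a trailing newline;
   the trace output adds its own. */
static void sketch_eprint(FILE *trace, const char *fmt, ...)
{
  va_list args;
  va_start(args, fmt);
  if (trace != NULL)
  {
    va_list copy;
    va_copy(copy, args);          /* a va_list may only be consumed once */
    vfprintf(trace, fmt, copy);
    fputc('\n', trace);
    va_end(copy);
  }
  sketch_report_v(fmt, args);
  va_end(args);
}
```

The va_copy is the point of the sketch: the same argument list is consumed twice, once for the trace file and once for the reporter.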
2007-07-26 11:56:21 +02:00
goto end;
}
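The WL#3072 commit message (2007-07-26) describes replacing the boolean "record ends group" flag with three cases: a record may not end a group, may end a group, or may be a group in itself, in which case it aborts any unfinished group (the COMMIT example). The sketch below illustrates that rule only. The enum and struct names are hypothetical and are not taken from ma_loghandler.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the real enum in ma_loghandler.h may differ. */
enum sketch_group_role
{
  SKETCH_NOT_GROUP_END,    /* e.g. a REDO: joins the group being built */
  SKETCH_ENDS_GROUP,       /* e.g. an UNDO: completes the current group */
  SKETCH_IS_GROUP_ITSELF   /* e.g. a COMMIT: stands alone, aborts open group */
};

struct sketch_group_scan
{
  size_t open_records;     /* records collected in the unfinished group */
  size_t executed_records; /* records belonging to completed groups */
  size_t aborted_records;  /* records dropped because their group never ended */
};

/* Feed one record's role into the scan and update the counters. */
static void sketch_scan_record(struct sketch_group_scan *s,
                               enum sketch_group_role role)
{
  switch (role)
  {
  case SKETCH_NOT_GROUP_END:
    s->open_records++;
    break;
  case SKETCH_ENDS_GROUP:
    /* The whole group, including this closing record, can be executed. */
    s->executed_records+= s->open_records + 1;
    s->open_records= 0;
    break;
  case SKETCH_IS_GROUP_ITSELF:
    /* Any unfinished group is abandoned; this record is executed alone. */
    s->aborted_records+= s->open_records;
    s->open_records= 0;
    s->executed_records++;
    break;
  }
}
```

Feeding two REDO-like records followed by a COMMIT-like record leaves two aborted records and one executed record, which is the behaviour the commit message asks for when a write failure prevented the UNDO from being logged.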
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash occurs between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
all_tables[sid].info= NULL;
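The 2007-08-29 commit message explains the crash-resistant creation of the first bitmap page: size the file to 8190 bytes first, then write the 2-byte end marker, so that a crash between the two steps leaves a file that is visibly too short and gets recreated rather than a full page with a missing marker. A minimal sketch of that ordering with POSIX calls is given below. The function names are hypothetical, the 8192-byte page size and the "bm" marker come from the commit messages in this listing, and the real code uses MariaDB's own file wrappers rather than ftruncate()/pwrite().

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#define SKETCH_PAGE_SIZE 8192
static const unsigned char sketch_marker[2]= { 'b', 'm' };

/* Create the first bitmap page: extend the file to page size minus the
   marker, then append the marker. Returns 0 on success, -1 on error. */
static int sketch_create_first_bitmap(int fd)
{
  off_t body= SKETCH_PAGE_SIZE - (off_t) sizeof(sketch_marker);
  if (ftruncate(fd, body) != 0)
    return -1;
  /* A crash here leaves an 8190-byte file: too short to be a page. */
  if (pwrite(fd, sketch_marker, sizeof(sketch_marker), body) !=
      (ssize_t) sizeof(sketch_marker))
    return -1;
  return 0;
}

/* Returns 1 if a complete page with its end marker exists, 0 if the page
   is too short or unmarked and should be recreated, -1 on I/O error. */
static int sketch_first_bitmap_is_complete(int fd)
{
  struct stat st;
  unsigned char tail[2];
  if (fstat(fd, &st) != 0)
    return -1;
  if (st.st_size < SKETCH_PAGE_SIZE)
    return 0;
  if (pread(fd, tail, sizeof(tail), SKETCH_PAGE_SIZE - 2) != 2)
    return -1;
  return tail[0] == sketch_marker[0] && tail[1] == sketch_marker[1];
}
```

With this ordering the file length alone distinguishes "never finished" from "finished", so the check never has to interpret a full-size page whose marker bytes are still zero.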
2007-07-26 11:56:21 +02:00
}
name= (char *)log_record_buffer.str + FILEID_STORE_SIZE;
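The line above points name directly into the log record buffer, just past the stored file id. The WL#3072 commit message notes that ma_delete_table.c logs the end-zero of the table name precisely so that recovery can use the string in place. A minimal sketch of that access pattern follows. The helper name is hypothetical, and the 2-byte id width is an assumption made for illustration, not a value confirmed by this listing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed width of the stored table id, for illustration only. */
#define SKETCH_FILEID_STORE_SIZE 2

/* Returns a pointer to the table name stored inside the record buffer,
   or NULL if the buffer is too short or the name is not zero-terminated
   within the record. No copy is made: the logged end-zero makes the
   bytes usable as a C string where they are. */
static const char *sketch_name_in_record(const unsigned char *buf, size_t len)
{
  const unsigned char *name;
  if (len <= SKETCH_FILEID_STORE_SIZE)
    return NULL;
  name= buf + SKETCH_FILEID_STORE_SIZE;
  if (memchr(name, '\0', len - SKETCH_FILEID_STORE_SIZE) == NULL)
    return NULL;
  return (const char *) name;
}
```

The memchr() check is an addition of this sketch; it guards against a record written without the terminating zero, which is the case the commit message says the logging change removes.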
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
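The commit note above says the hash key in all_dirty_pages is now built from index_or_data, the table's short id and the page number instead of a file descriptor and page number. A minimal sketch of one way to pack those three fields into a single 64-bit key follows. The function name `dirty_page_key`, the 40-bit page-number width and the exact bit layout are assumptions made for illustration; they are not taken from ma_recovery.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical width reserved for the page number inside the key. */
#define SKETCH_PAGENO_BITS 40

/*
  Build a lookup key for a dirty page from:
    is_index  - non-zero for an index file page, zero for a data file page
    short_id  - the table's 2-byte short id
    pageno    - page number inside that file
  The flag and the short id form the high part, the page number the low part,
  so the same page number in the data and index files gives different keys.
*/
static uint64_t dirty_page_key(int is_index, uint16_t short_id, uint64_t pageno)
{
  uint64_t file_part= ((uint64_t) (is_index ? 1 : 0) << 16) | short_id;
  /* The page number must fit in the bits reserved for it. */
  assert(pageno < ((uint64_t) 1 << SKETCH_PAGENO_BITS));
  return (file_part << SKETCH_PAGENO_BITS) | pageno;
}
```

A caller would compute `dirty_page_key(0, sid, page)` for a data page and `dirty_page_key(1, sid, page)` for an index page before searching the hash. Because the key no longer depends on a file descriptor, it stays meaningful after a restart, when descriptors from the crashed process are gone.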
if (new_table(sid, name, rec->lsn))
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks the marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
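The ma_bitmap.c note above explains why the first bitmap page is now sized to 8190 bytes before the 2-byte end marker is written: after a crash between the two steps, the file is visibly too short rather than full-length with a missing marker. A minimal sketch of the check this enables follows, working on an in-memory copy of the file. The names `check_first_bitmap_page`, `SKETCH_PAGE_SIZE` and the enum values are hypothetical, and using "bm" as the marker bytes is an assumption based on the signature mentioned in the commit notes.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192                 /* assumed 8k bitmap page */

/* Assumed 2-byte end marker stored in the last bytes of the page. */
static const unsigned char sketch_marker[2]= { 'b', 'm' };

enum sketch_page_state
{
  SKETCH_PAGE_OK,          /* full page, marker present */
  SKETCH_PAGE_TOO_SHORT,   /* creation was interrupted: recreate the page */
  SKETCH_PAGE_CORRUPTED    /* full page but no marker: report an error */
};

/*
  Classify the first bitmap page from the bytes currently in the file.
  A file shorter than one page can only result from an interrupted
  creation, so it is safe to rebuild the page instead of failing.
*/
static enum sketch_page_state
check_first_bitmap_page(const unsigned char *file, size_t file_length)
{
  if (file_length < SKETCH_PAGE_SIZE)
    return SKETCH_PAGE_TOO_SHORT;
  if (memcmp(file + SKETCH_PAGE_SIZE - sizeof(sketch_marker),
             sketch_marker, sizeof(sketch_marker)) != 0)
    return SKETCH_PAGE_CORRUPTED;
  return SKETCH_PAGE_OK;
}
```

With the old 8192-byte chsize, a crash before the marker write would land in the SKETCH_PAGE_CORRUPTED branch; with the 8190-byte chsize the same crash lands in SKETCH_PAGE_TOO_SHORT, which recovery can handle by recreating the page.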
goto end;
error= 0;
end:
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updated some result files to new table checksum results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' for my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TOO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descriptors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
DBUG_RETURN(error);
}
static int new_table(uint16 sid, const char *name, LSN lsn_of_file_id)
{
/*
-1 (skip table): close table and return 0;
1 (error): close table and return 1;
0 (success): leave table open and return 0.
*/
int error= 1;
MARIA_HA *info;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
my_off_t dfile_len, kfile_len;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
WL#3071 - Maria checkpoint * Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing). * Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown. * fix for a test failure (after-merge fix) include/maria.h: new variable mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result: result update mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test: position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail) storage/maria/ha_maria.cc: Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init(). storage/maria/ma_checkpoint.c: Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset. storage/maria/ma_recovery.c: Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't make anything significant (didn't open any tables, didn't rollback any transactions). 
With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
checkpoint_useful= TRUE;
if ((name == NULL) || (name[0] == 0))
{
/*
We do not use DBUG_ASSERT() here because such record corruption could
otherwise pass silently through the "info == NULL" test below.
*/
tprint(tracef, ", record is corrupted");
info= NULL;
Fix for LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery More DBUG_PRINT (to simplify future debugging) Aria: Added STATE_IN_REPAIR, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Aria: Some trivial speedup optimization Aria: Better warning if table was marked crashed by unfinnished repair mysql-test/lib/v1/mysql-test-run.pl: Fix so one can run RQG mysql-test/suite/maria/r/maria-recovery2.result: Update for new error message. mysys/stacktrace.c: Fixed compiler warning storage/maria/ha_maria.cc: More DBUG_PRINT Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Don't log query for dropping temporary table. storage/maria/ha_maria.h: Added prototype for drop_table() storage/maria/ma_blockrec.c: More DBUG_PRINT Make read_long_data() inline for most cases. (Trivial speedup optimization) storage/maria/ma_check.c: Better warning if table was marked crashed by unfinnished repair storage/maria/ma_open.c: More DBUG_PRINT storage/maria/ma_recovery.c: Give warning if found crashed table. Changed warning for tables that can't be opened. storage/maria/ma_recovery_util.c: Write warnings to DBUG file storage/maria/maria_chk.c: Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. storage/maria/maria_def.h: Added maria_mark_in_repair(x) storage/maria/maria_read_log.c: Added option: --character-sets-dir storage/maria/trnman.c: By default set min_read_from to max value. This allows us to remove TRN:s from rows during recovery to get more space. This fixes bug LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
2010-07-30 09:45:27 +02:00
recovery_warnings++;
goto end;
}
tprint(tracef, "Table '%s', id %u", name, sid);
info= maria_open(name, O_RDWR, HA_OPEN_FOR_REPAIR);
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scracth). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
if (info == NULL)
{
tprint(tracef, ", is absent (must have been dropped later?)"
" or its header is so corrupted that we cannot open it;"
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started unitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE. storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
" we skip it");
if (my_errno != ENOENT)
recovery_found_crashed_tables++;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
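The three-way record classification described in the commit message above (a record may not end a group, may end a group, or may be a group in itself and abort any unfinished group) can be sketched as follows. This is a minimal illustration only: the enum, struct and function names are hypothetical and are not taken from ma_loghandler.h or ma_recovery.c, and the counters merely stand in for real record execution.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the real enum lives in ma_loghandler.h. */
enum group_role {
  REC_NOT_GROUP_END,    /* part of a group, does not close it (a REDO) */
  REC_GROUP_END,        /* closes the current group (an UNDO) */
  REC_IS_GROUP_ITSELF   /* stands alone, e.g. COMMIT */
};

struct group_state {
  size_t pending;   /* records buffered in the unfinished group */
  size_t executed;  /* records executed so far */
  size_t aborted;   /* records discarded without being executed */
};

/* Feed one log record, in log order, to the group tracker. */
static void feed_record(struct group_state *st, enum group_role role)
{
  switch (role) {
  case REC_NOT_GROUP_END:
    st->pending++;                   /* wait for the end of the group */
    break;
  case REC_GROUP_END:
    st->executed+= st->pending + 1;  /* run buffered records and this one */
    st->pending= 0;
    break;
  case REC_IS_GROUP_ITSELF:
    st->aborted+= st->pending;       /* unfinished group is dropped */
    st->pending= 0;
    st->executed++;                  /* the record itself is still run */
    break;
  }
}
```

With this shape, a COMMIT that follows REDOs whose UNDO was never logged discards those REDOs instead of closing their group, which is the failure scenario the message describes.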
error= 0;
goto end;
}
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
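The commit message above mentions shifting the record number stored in keys up by one bit, so that transactional tables have room for a flag saying whether a transaction id follows. A minimal sketch of that encoding is given below. The function names are hypothetical, and using the lowest bit as the flag is an assumption made for illustration; the message does not say which bit Maria actually uses.

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical helpers, not Maria's own functions.
  Assumption: the flag occupies the lowest bit of the stored value.
*/
static uint64_t pack_recpos(uint64_t recpos, int transid_follows)
{
  return (recpos << 1) | (transid_follows ? 1u : 0u);
}

/* Reverse of pack_recpos(): returns the record number, sets the flag. */
static uint64_t unpack_recpos(uint64_t stored, int *transid_follows)
{
  *transid_follows= (int) (stored & 1);
  return stored >> 1;
}
```

The cost of this layout is one bit of record-number range per key entry, in exchange for not storing a transaction id with every key.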
share= info->s;
/* check that we're not already using it */
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
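The REDO_RENAME rule described in the commit message above (check create_rename_lsn of the 'new_name' table if it exists; if that table is more recent than the record, the rename is not replayed and the 'old_name' table is removed) comes down to one LSN comparison. The sketch below models only that comparison. The type and function names are hypothetical, and treating an equal LSN as "more recent" is an assumption, since the message does not state the boundary case.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t lsn_t;  /* hypothetical stand-in for Maria's LSN type */

/*
  Returns 1 if the existing 'new_name' table already reflects the rename
  (so the record must not be replayed and 'old_name' can be removed),
  0 otherwise. What happens in the 0 case is outside this sketch.
*/
static int new_table_supersedes_rename(int new_table_exists,
                                       lsn_t new_create_rename_lsn,
                                       lsn_t record_lsn)
{
  if (!new_table_exists)
    return 0;
  /* Assumption: equality counts as "already applied". */
  return new_create_rename_lsn >= record_lsn;
}
```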
if (share->reopen != 1)
{
tprint(tracef, ", is already open (reopen=%u)\n", share->reopen);
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both causes rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
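The fix described in the commit message above (keep pages pinned and leave their LSN untouched while a group's REDOs are applied, then stamp them with the UNDO's LSN once the group is complete) can be illustrated with a small model. All names here are hypothetical and the page is reduced to three fields. The skip test (page LSN not older than the record's LSN means the change is already on the page) is the usual REDO idempotency check and is assumed here rather than copied from ma_blockrec.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;  /* hypothetical stand-in for Maria's LSN type */

struct page {
  lsn_t lsn;     /* LSN stamped on the page */
  int pinned;    /* 1 while the page is held for the current group */
  int applied;   /* number of REDOs applied, for illustration */
};

/* Apply one REDO; returns 1 if applied, 0 if skipped as already done. */
static int apply_redo(struct page *pg, lsn_t redo_lsn)
{
  if (pg->lsn >= redo_lsn)
    return 0;          /* page already contains this change */
  pg->applied++;
  pg->pinned= 1;       /* keep pinned; do NOT touch pg->lsn yet */
  return 1;
}

/* End of group: unpin the touched pages and stamp the UNDO's LSN. */
static void end_group(struct page **pages, size_t count, lsn_t undo_lsn)
{
  size_t i;
  for (i= 0; i < count; i++)
  {
    if (pages[i]->pinned)
    {
      pages[i]->lsn= undo_lsn;
      pages[i]->pinned= 0;
    }
  }
}
```

If apply_redo() stamped the UNDO's LSN immediately, a second REDO for the same page in the same group would see a page LSN above its own and be skipped, which is the bug the message attributes to Jani and Monty's report.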
/*
It could be that we have in the log
FILE_ID(t1,10) ... (t1 was flushed) ... FILE_ID(t1,12);
*/
Changed all file names in maria to LEX_STRING and removed some calls to strlen() Ensure that pagecache gives correct error number even if error for block happened mysys/my_pread.c: Indentation fix storage/maria/ha_maria.cc: filenames changed to be of type LEX_STRING storage/maria/ma_check.c: filenames changed to be of type LEX_STRING storage/maria/ma_checkpoint.c: filenames changed to be of type LEX_STRING storage/maria/ma_create.c: filenames changed to be of type LEX_STRING storage/maria/ma_dbug.c: filenames changed to be of type LEX_STRING storage/maria/ma_delete.c: filenames changed to be of type LEX_STRING storage/maria/ma_info.c: filenames changed to be of type LEX_STRING storage/maria/ma_keycache.c: filenames changed to be of type LEX_STRING storage/maria/ma_locking.c: filenames changed to be of type LEX_STRING storage/maria/ma_loghandler.c: filenames changed to be of type LEX_STRING storage/maria/ma_open.c: filenames changed to be of type LEX_STRING storage/maria/ma_pagecache.c: Store error number for last failed operation in the page block This should fix some asserts() when errno was not properly set after failure to read block in another thread storage/maria/ma_recovery.c: filenames changed to be of type LEX_STRING storage/maria/ma_update.c: filenames changed to be of type LEX_STRING storage/maria/ma_write.c: filenames changed to be of type LEX_STRING storage/maria/maria_def.h: filenames changed to be of type LEX_STRING storage/maria/maria_ftdump.c: filenames changed to be of type LEX_STRING storage/maria/maria_pack.c: filenames changed to be of type LEX_STRING
2008-08-25 13:49:47 +02:00
if (close_one_table(share->open_file_name.str, lsn_of_file_id))
2007-11-13 17:12:29 +01:00
goto end;
WL#3072 Maria Recovery * recovery from ha_maria now skips replaying DDLs (too dangerous) * maria_read_log still replays DDLs, print warning about issues * fixes to replaying of REDO_RENAME * don't replay DDLs on corrupted tables (safer) * print a one-line message when really doing a recovery (applies to ha_maria, not maria_read_log) i.e. some REDOs or UNDOs are read. storage/maria/ma_checkpoint.c: fix for assertion failure storage/maria/ma_recovery.c: * Recovery from ha_maria now skips replaying DDLs (as the initial plan said) as this is unsafe in case of crashes during the DDL; applying the records may do harm (destroy important files) so we prefer to leave the "mess" of files untouched. A proper recovery of DDLs requires very careful thinking, probably testing separately the existence of the data and index file instead of using maria_open() which tests the existence of both, and maybe storing create_rename_lsn in the data file too. * maria_read_log still replays DDLs, we print a warning about dangers (due to ALTER TABLE not logging insertions into the tmp table; we will maybe need an option to have logging of those insertions). * fixes to replaying of REDO_RENAME (test create_rename_lsn of 'new_name' table if it exists; if that table exists and is more recent than the record, remove the 'old_name' table). * don't replay DDLs on corrupted tables (play safe) * fail also in non-debug builds if table is open when it should not be (when creating it for example, it should not be already open). * when the trace file is not stdout (i.e. when this is ha_maria), if really doing a recovery (reading REDOs or UNDOs), print a one-line message to stderr to inform about start and end of recovery (useful to know what mysqld is doing, especially if it takes long or crashes). storage/maria/ma_recovery.h: parameter to replay DDLs or not storage/maria/maria_read_log.c: replay DDLs in maria_read_log, to be able to recreate tables from scratch.
2007-09-15 14:45:26 +02:00
}
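The situation described in the comment above, where the same table appears in the log under two different short ids, can be sketched in isolation. This is a minimal, hypothetical model and not the code of ma_recovery.c: `open_table_entry`, `find_open`, `register_file_id`, the fixed-size array and `MAX_SHORT_ID` are all assumptions made for illustration. It only shows why the earlier instance is closed before the new id is recorded.

```c
#include <assert.h>
#include <string.h>

#define MAX_SHORT_ID 16            /* hypothetical, small id space */

/* Hypothetical stand-in for one slot of recovery's short-id table. */
struct open_table_entry {
  char name[32];                   /* empty string means the slot is free */
};

static struct open_table_entry all_tables[MAX_SHORT_ID];

/* Return the short id under which 'name' is currently open, or -1. */
static int find_open(const char *name)
{
  assert(name != NULL);
  for (int id = 0; id < MAX_SHORT_ID; id++)
    if (all_tables[id].name[0] != '\0' &&
        strcmp(all_tables[id].name, name) == 0)
      return id;
  return -1;
}

/*
  Process a FILE_ID(name, short_id) record in this toy model.
  If the table is still open under an older id (for example
  FILE_ID(t1,10) followed later by FILE_ID(t1,12)), the older
  instance is closed first. Returns 0 on success, 1 on bad input.
*/
static int register_file_id(const char *name, int short_id)
{
  int old_id;
  assert(name != NULL);
  if (short_id < 0 || short_id >= MAX_SHORT_ID ||
      strlen(name) >= sizeof(all_tables[0].name))
    return 1;
  old_id = find_open(name);
  if (old_id >= 0)
    all_tables[old_id].name[0] = '\0';   /* "close" the older instance */
  strcpy(all_tables[short_id].name, name);
  return 0;
}
```

Calling `register_file_id("t1", 10)` and then `register_file_id("t1", 12)` leaves the table reachable only through id 12. The sketch silently overwrites a slot that is occupied by another table, which a real implementation would have to handle.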
2007-07-26 11:56:21 +02:00
if (!share->base.born_transactional)
{
Added versioning of Maria index Store max_trid in index file as state.create_trid. This is used to pack all transids in the index pages relative to max possible transid for file. Enable versioning for transactional tables with index. Tables with an auto-increment key, rtree or fulltext keys are not versioned. Changed info->lastkey to type MARIA_KEY. Removed info->lastkey_length as this is now part of info->lastkey Renamed old info->lastkey to info->lastkey_buff Use exact key lengths for keys, not USE_WHOLE_KEY For partial key searches, use SEARCH_PART_KEY When searching to insert new key on page, use SEARCH_INSERT to mark that key has rowid Changes done in a lot of files: - Modified functions to use MARIA_KEY instead of key pointer and key length - Use keyinfo->root_lock instead of share->key_root_lock[keynr] - Simplify code by using local variable keyinfo instead of share->keyinfo[i] - Added #ifdef EXTERNAL_LOCKING around removed state elements - HA_MAX_KEY_BUFF -> MARIA_MAX_KEY_BUFF (to reserve space for transid) - Changed type of 'nextflag' to uint32 to ensure all SEARCH_xxx flags fit into it .bzrignore: Added missing temporary directory extra/Makefile.am: comp_err is now deleted on make distclean include/maria.h: Added structure MARIA_KEY, which is used for internal key objects in Maria. Changed functions to take MARIA_KEY as an argument instead of pointer to packed key. Changed some functions that always return true or false to my_bool. Added virtual function make_key() to avoid if in _ma_make_key() Moved rw_lock_t for locking trees from share->key_root_lock to MARIA_KEYDEF. This makes usage of the locks simpler and faster include/my_base.h: Added HA_RTREE_INDEX flag to mark rtree index. 
Used for easier checks in ma_check() Added SEARCH_INSERT to be used when inserting new keys Added SEARCH_PART_KEY for partial searches Added SEARCH_USER_KEY_HAS_TRANSID to be used when key we use for searching in btree has a TRANSID Added SEARCH_PAGE_KEY_HAS_TRANSID to be used when key we found in btree has a transid include/my_handler.h: Make next_flag 32 bit to make sure we can handle all SEARCH_ bits mysql-test/include/maria_empty_logs.inc: Read and restore current database; Don't assume we are using mysqltest. Don't log use databasename to log. Using this include should not cause any result changes. mysql-test/r/maria-gis-rtree-dynamic.result: Updated results after adding some check table commands to help pinpoint errors mysql-test/r/maria-mvcc.result: New tests mysql-test/r/maria-purge.result: New result after adding removal of logs mysql-test/r/maria-recovery-big.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria-recovery-bitmap.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria-recovery-rtree-ft.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria-recovery.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria.result: New tests mysql-test/r/variables-big.result: Don't log id as it's not predictable mysql-test/suite/rpl_ndb/r/rpl_truncate_7ndb_2.result: Updated results to new binlog results. 
(Test has not been run in a long time as it requires --big) mysql-test/suite/rpl_ndb/t/rpl_truncate_7ndb_2-master.opt: Moved file to ndb replication test directory mysql-test/suite/rpl_ndb/t/rpl_truncate_7ndb_2.test: Fixed wrong path to included tests mysql-test/t/maria-gis-rtree-dynamic.test: Added some check table commands to help pinpoint errors mysql-test/t/maria-mvcc.test: New tests mysql-test/t/maria-purge.test: Remove logs to make test results predictable mysql-test/t/maria.test: New tests for some possible problems mysql-test/t/variables-big.test: Don't log id as it's not predictable mysys/my_handler.c: Updated function comment to reflect old code Changed nextflag to be uint32 to ensure we can have flags > 16 bit Changed checking if we are in insert with NULL keys as next_flag can now include additional bits that have to be ignored. Added SEARCH_INSERT flag to be used when inserting new keys in btree. This flag tells us the that the keys includes row position and it's thus safe to remove SEARCH_FIND Added comparision of transid. This is only done if the keys actually have a transid, which is indicated by nextflag mysys/my_lock.c: Fixed wrong test (Found by Guilhem) scripts/Makefile.am: Ensure that test programs are deleted by make clean sql/rpl_rli.cc: Moved assignment order to fix compiler warning storage/heap/hp_write.c: Add SEARCH_INSERT to signal ha_key_cmp that we we should also compare rowid for keys storage/maria/Makefile.am: Remove also maria log files when doing make distclean storage/maria/ha_maria.cc: Use 'file->start_state' as default state for transactional tables without versioning At table unlock, set file->state to point to live state. (Needed for information schema to pick up right number of rows) In ha_maria::implicit_commit() move all locked (ie open) tables to new transaction. This is needed to ensure ha_maria->info doesn't point to a deleted history event. Disable concurrent inserts for insert ... 
select and table changes with subqueries if statement based replication as this would cause wrong results on slave storage/maria/ma_blockrec.c: Updated comment storage/maria/ma_check.c: Compact key pages (removes transid) when doing --zerofill Check that 'page_flag' on key pages contains KEYPAGE_FLAG_HAS_TRANSID if there is a single key on the page with a transid Modified functions to use MARIA_KEY instead of key pointer and key length Use new interface to _ma_rec_pos(), _ma_dpointer(), _ma_ft_del(), ma_update_state_lsn() Removed not needed argument from get_record_for_key() Fixed that we check doesn't give errors for RTREE; We now treath these like SPATIAL Remove some SPATIAL specific code where the virtual functions can handle this in a general manner Use info->lastkey_buff instead of info->lastkey _ma_dpos() -> _ma_row_pos_from_key() _ma_make_key() -> keyinfo->make_key() _ma_print_key() -> _ma_print_keydata() _ma_move_key() -> ma_copy_copy() Add SEARCH_INSERT to signal ha_key_cmp that we we should also compare rowid for keys Ensure that data on page doesn't overwrite page checksum position Use DBUG_DUMP_KEY instead of DBUG_DUMP Use exact key lengths instead of USE_WHOLE_KEY to ha_key_cmp() Fixed check if rowid points outside of BLOCK_RECORD data file Use info->lastkey_buff instead of key on stack in some safe places Added #fdef EXTERNAL_LOCKING around removed state elements storage/maria/ma_close.c: Use keyinfo->root_lock instead of share->key_root_lock[keynr] storage/maria/ma_create.c: Removed assert that is already checked in maria_init() Force transactinal tables to be of type BLOCK_RECORD Fixed wrong usage of HA_PACK_RECORD (should be HA_OPTION_PACK_RECORD) Mark keys that uses HA_KEY_ALG_RTREE with HA_RTREE_INDEX for easier handling of these in ma_check Store max_trid in index file as state.create_trid. This is used to pack all transids in the index pages relative to max possible transid for file. 
storage/maria/ma_dbug.c: Changed _ma_print_key() to use MARIA_KEY storage/maria/ma_delete.c: Modified functions to use MARIA_KEY instead of key pointer and key length info->lastkey2-> info->lastkey_buff2 Added SEARCH_INSERT to signal ha_key_cmp that we we should also compare rowid for keys Use new interface for get_key(), _ma_get_last_key() and others _ma_dpos() -> ma_row_pos_from_key() Simplify setting of prev_key in del() Ensure that KEYPAGE_FLAG_HAS_TRANSID is set in page_flag if key page has transid Treath key pages that may have a transid as if keys would be of variable length storage/maria/ma_delete_all.c: Reset history state if maria_delete_all_rows() are called Update parameters to _ma_update_state_lsns() call storage/maria/ma_extra.c: Store and restore info->lastkey storage/maria/ma_ft_boolean_search.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_ft_nlq_search.c: Modified functions to use MARIA_KEY instead of key pointer and key length Use lastkey_buff2 instead of info->lastkey+info->s->base.max_key_length (same thing) storage/maria/ma_ft_update.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_ftdefs.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_fulltext.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_init.c: Check if blocksize is legal (Moved test here from ma_open()) storage/maria/ma_key.c: Added functions for storing/reading of transid Modified functions to use MARIA_KEY instead of key pointer and key length Moved _ma_sp_make_key() out of _ma_make_key() as we now use keyinfo->make_key to create keys Add transid to keys if table is versioned Added _ma_copy_key() storage/maria/ma_key_recover.c: Add logging of page_flag (holds information if there are keys with transid on page) Changed DBUG_PRINT("info" -> DBUG_PRINT("redo" as the redo logging can be quite extensive Added 
lots of DBUG_PRINT() Added support for index page operations: KEY_OP_SET_PAGEFLAG and KEY_OP_COMPACT_PAGE storage/maria/ma_key_recover.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_locking.c: Added new arguments to _ma_update_state_lsns_sub() storage/maria/ma_loghandler.c: Fixed all logging of LSN to look similar in DBUG log Changed if (left != 0) to if (left) as the later is used also later in the code storage/maria/ma_loghandler.h: Added new index page operations storage/maria/ma_open.c: Removed allocated "state_dummy" and instead use share->state.common for transactional tables that are not versioned This is needed to not get double increments of state.records (one in ma_write.c and on when log is written) Changed info->lastkey to MARIA_KEY type Removed resetting of MARIA_HA variables that have 0 as default value (as info is zerofilled) Enable versioning for transactional tables with index. Tables with an auto-increment key, rtree or fulltext keys are not versioned. Check on open that state.create_trid is correct Extend share->base.max_key_length in case of transactional table so that it can hold transid Removed 4.0 compatible fulltext key mode as this is not relevant for Maria Removed old and wrong #ifdef ENABLE_WHEN_WE_HAVE_TRANS_ROW_ID code block Initialize all new virtual function pointers Removed storing of state->unique, state->process and store state->create_trid instead storage/maria/ma_page.c: Added comment to describe key page structure Added functions to compact key page and log the compact operation storage/maria/ma_range.c: Modified functions to use MARIA_KEY instead of key pointer and key length Use SEARCH_PART_KEY indicator instead of USE_WHOLE_KEY to detect if we are doing a part key search Added handling of pages with transid storage/maria/ma_recovery.c: Don't assert if table we opened are not transactional. 
This may be a table which has been changed from transactional to not transactional Added new arguments to _ma_update_state_lsns() storage/maria/ma_rename.c: Added new arguments to _ma_update_state_lsns() storage/maria/ma_rkey.c: Modified functions to use MARIA_KEY instead of key pointer and key length Don't use USE_WHOLE_KEY, use real length of key Use share->row_is_visible() to test if row is visible Moved search_flag == HA_READ_KEY_EXACT out of 'read-next-row' loop as this only needs to be tested once Removed test if last_used_keyseg != 0 as this is always true storage/maria/ma_rnext.c: Modified functions to use MARIA_KEY instead of key pointer and key length Simplify code by using local variable keyinfo instead of share->keyinfo[i] Use share->row_is_visible() to test if row is visible storage/maria/ma_rnext_same.c: Modified functions to use MARIA_KEY instead of key pointer and key length lastkey2 -> lastkey_buff2 storage/maria/ma_rprev.c: Modified functions to use MARIA_KEY instead of key pointer and key length Simplify code by using local variable keyinfo instead of share->keyinfo[i] Use share->row_is_visible() to test if row is visible storage/maria/ma_rsame.c: Updated comment Simplify code by using local variable keyinfo instead of share->keyinfo[i] Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rsamepos.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rt_index.c: Modified functions to use MARIA_KEY instead of key pointer and key length Use better variable names Removed not needed casts _ma_dpos() -> _ma_row_pos_from_key() Use info->last_rtree_keypos to save position to key instead of info->int_keypos Simplify err: condition Changed return type for maria_rtree_insert() to my_bool as we are only interested in ok/fail from this function storage/maria/ma_rt_index.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rt_key.c: Modified 
functions to use MARIA_KEY instead of key pointer and key length Simplify maria_rtree_add_key by combining idenitcal code and removing added_len storage/maria/ma_rt_key.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rt_mbr.c: Changed type of 'nextflag' to uint32 Added 'to' argument to RT_PAGE_MBR_XXX functions to more clearly see which variables changes value storage/maria/ma_rt_mbr.h: Changed type of 'nextflag' to uint32 storage/maria/ma_rt_split.c: Modified functions to use MARIA_KEY instead of key pointer and key length key_length -> key_data_length to catch possible errors storage/maria/ma_rt_test.c: Fixed wrong comment Reset recinfo to avoid valgrind varnings Fixed wrong argument to create_record() that caused test to fail storage/maria/ma_search.c: Modified functions to use MARIA_KEY instead of key pointer and key length Added support of keys with optional trid Test for SEARCH_PART_KEY instead of USE_WHOLE_KEY to detect part key reads _ma_dpos() -> _ma_row_pos_from_key() If there may be keys with transid on the page, have _ma_bin_search() call _ma_seq_search() Add _ma_skip_xxx() functions to quickly step over keys (faster than calling get_key() in most cases as we don't have to copy key data) Combine similar code at end of _ma_get_binary_pack_key() Removed not used function _ma_move_key() In _ma_search_next() don't call _ma_search() if we aren't on a nod page. 
Update info->cur_row.trid with trid for found key Removed some not needed casts Added _ma_trid_from_key() Use MARIA_SHARE instead of MARIA_HA as arguments to _ma_rec_pos(), _ma_dpointer() and _ma_xxx_keypos_to_recpos() to make functions faster and smaller storage/maria/ma_sort.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_sp_defs.h: _ma_sp_make_key() now fills in and returns (MARIA_KEY *) value storage/maria/ma_sp_key.c: _ma_sp_make_key() now fills in and returns (MARIA_KEY *) value Don't test sizeof(double), test against 8 as we are using float8store() Use mi_float8store() instead of doing swap of value (same thing but faster) storage/maria/ma_state.c: maria_versioning() now only calls _ma_block_get_status() if table supports versioning Added _ma_row_visible_xxx() functions for different occasions When emptying history, set info->state to point to the first history event. storage/maria/ma_state.h: Added _ma_row_visible_xxx() prototypes storage/maria/ma_static.c: Indentation changes storage/maria/ma_statrec.c: Fixed arguments to _ma_dpointer() and _ma_rec_pos() storage/maria/ma_test1.c: Call init_thr_lock() if we have versioning storage/maria/ma_test2.c: Call init_thr_lock() if we have versioning storage/maria/ma_unique.c: Modified functions to use MARIA_KEY storage/maria/ma_update.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_write.c: Modified functions to use MARIA_KEY instead of key pointer and key length Simplify code by using local variable keyinfo instead if share->keyinfo[i] In _ma_enlarge_root(), mark in page_flag if new key has transid _ma_dpos() -> _ma_row_pos_from_key() Changed return type of _ma_ck_write_tree() to my_bool as we are only testing if result is true or not Moved 'reversed' to outside block as area was used later storage/maria/maria_chk.c: Added error if trying to sort with HA_BINARY_PACK_KEY Use new interface to get_key() and _ma_dpointer() 
_ma_dpos() -> _ma_row_pos_from_key() storage/maria/maria_def.h: Modified functions to use MARIA_KEY instead of key pointer and key length Added 'common' to MARIA_SHARE->state for storing state for transactional tables without versioning Added create_trid to MARIA_SHARE Removed not used state variables 'process' and 'unique' Added defines for handling TRID's in index pages Changed to use MARIA_SHARE instead of MARIA_HA for some functions Added 'have_versioning' flag if table supports versioning Moved key_root_lock from MARIA_SHARE to MARIA_KEYDEF Changed last_key to be of type MARIA_KEY. Removed lastkey_length lastkey -> lastkey_buff, lastkey2 -> lastkey_buff2 Added _ma_get_used_and_nod_with_flag() for faster access to page data when page_flag is read Added DBUG_DUMP_KEY for easier DBUG_DUMP of a key Changed 'nextflag' and assocaited variables to uint32 storage/maria/maria_ftdump.c: lastkey -> lastkey_buff storage/maria/trnman.c: Fixed wrong initialization of min_read_from and max_commit_trid Added trnman_get_min_safe_trid() storage/maria/unittest/ma_test_all-t: Added --start-from storage/myisam/mi_check.c: Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparision for inserting key on page in rowid order storage/myisam/mi_delete.c: Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparision for inserting key on page in rowid order storage/myisam/mi_range.c: Updated comment storage/myisam/mi_write.c: Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparision for inserting key on page in rowid order storage/myisam/rt_index.c: Fixed wrong parameter to rtree_get_req() which could cause crash
2008-06-26 07:18:28 +02:00
/*
This can happen if one converts a transactional table to a
not transactional table
*/
tprint(tracef, ", is not transactional. Ignoring open request");
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash occurs between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
error= -1;
Fix for LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery More DBUG_PRINT (to simplify future debugging) Aria: Added STATE_IN_REPAIR, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Aria: Some trivial speedup optimization Aria: Better warning if table was marked crashed by unfinished repair mysql-test/lib/v1/mysql-test-run.pl: Fix so one can run RQG mysql-test/suite/maria/r/maria-recovery2.result: Update for new error message. mysys/stacktrace.c: Fixed compiler warning storage/maria/ha_maria.cc: More DBUG_PRINT Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Don't log query for dropping temporary table. storage/maria/ha_maria.h: Added prototype for drop_table() storage/maria/ma_blockrec.c: More DBUG_PRINT Make read_long_data() inline for most cases. (Trivial speedup optimization) storage/maria/ma_check.c: Better warning if table was marked crashed by unfinished repair storage/maria/ma_open.c: More DBUG_PRINT storage/maria/ma_recovery.c: Give warning if found crashed table. Changed warning for tables that can't be opened. storage/maria/ma_recovery_util.c: Write warnings to DBUG file storage/maria/maria_chk.c: Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. storage/maria/maria_def.h: Added maria_mark_in_repair(x) storage/maria/maria_read_log.c: Added option: --character-sets-dir storage/maria/trnman.c: By default set min_read_from to max value. This allows us to remove TRN:s from rows during recovery to get more space. This fixes bug LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
2010-07-30 09:45:27 +02:00
recovery_warnings++;
goto end;
}
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this.
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state). storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
if (cmp_translog_addr(lsn_of_file_id, share->state.create_rename_lsn) <= 0)
{
tprint(tracef, ", has create_rename_lsn (%lu,0x%lx) more recent than"
" LOGREC_FILE_ID's LSN (%lu,0x%lx), ignoring open request",
LSN_IN_PARTS(share->state.create_rename_lsn),
LSN_IN_PARTS(lsn_of_file_id));
recovery_warnings++;
error= -1;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scracth). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
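The commit message above describes three roles a log record type can play with respect to a group: it does not end a group, it ends a group, or it is a group in itself (and so aborts any unfinished group before it, as COMMIT does). A minimal sketch of that rule follows. All names here (`group_role`, `group_state`, `feed_record`) are hypothetical and chosen for illustration; they are not the identifiers used in ma_loghandler.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three-valued record_ends_group setting. */
enum group_role
{
  ROLE_NOT_END,     /* record belongs to a group but does not close it */
  ROLE_ENDS_GROUP,  /* record closes the current group (e.g. an UNDO) */
  ROLE_IS_GROUP     /* record is a group by itself (e.g. a COMMIT) */
};

/* Hypothetical counters used only to observe the behaviour. */
struct group_state
{
  int pending;   /* records buffered in the unfinished group */
  int executed;  /* records that would be executed */
  int aborted;   /* records dropped because their group never ended */
};

static void feed_record(struct group_state *st, enum group_role role)
{
  assert(st != NULL);
  switch (role)
  {
  case ROLE_NOT_END:
    st->pending++;                    /* wait for the group's end */
    break;
  case ROLE_ENDS_GROUP:
    st->executed+= st->pending + 1;   /* whole group is complete */
    st->pending= 0;
    break;
  case ROLE_IS_GROUP:
    st->aborted+= st->pending;        /* unfinished group is abandoned */
    st->pending= 0;
    st->executed++;                   /* the record itself is executed */
    break;
  }
}
```

Feeding two `ROLE_NOT_END` records followed by one `ROLE_IS_GROUP` record leaves two records aborted and one executed, which mirrors the "COMMIT aborts any previous group" case described in the message.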
goto end;
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
/*
Note that we tested that before testing corruption; a recent corrupted
table is not a blocker for the present log record.
*/
}
if (maria_is_crashed(info))
{
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, including index sort (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
eprint(tracef, "Table '%s' is crashed, skipping it. Please repair it with"
Changed all file names in maria to LEX_STRING and removed some calls to strlen() Ensure that pagecache gives correct error number even if error for block happened mysys/my_pread.c: Indentation fix storage/maria/ha_maria.cc: filenames changed to be of type LEX_STRING storage/maria/ma_check.c: filenames changed to be of type LEX_STRING storage/maria/ma_checkpoint.c: filenames changed to be of type LEX_STRING storage/maria/ma_create.c: filenames changed to be of type LEX_STRING storage/maria/ma_dbug.c: filenames changed to be of type LEX_STRING storage/maria/ma_delete.c: filenames changed to be of type LEX_STRING storage/maria/ma_info.c: filenames changed to be of type LEX_STRING storage/maria/ma_keycache.c: filenames changed to be of type LEX_STRING storage/maria/ma_locking.c: filenames changed to be of type LEX_STRING storage/maria/ma_loghandler.c: filenames changed to be of type LEX_STRING storage/maria/ma_open.c: filenames changed to be of type LEX_STRING storage/maria/ma_pagecache.c: Store error number for last failed operation in the page block This should fix some asserts() when errno was not properly set after failure to read block in another thread storage/maria/ma_recovery.c: filenames changed to be of type LEX_STRING storage/maria/ma_update.c: filenames changed to be of type LEX_STRING storage/maria/ma_write.c: filenames changed to be of type LEX_STRING storage/maria/maria_def.h: filenames changed to be of type LEX_STRING storage/maria/maria_ftdump.c: filenames changed to be of type LEX_STRING storage/maria/maria_pack.c: filenames changed to be of type LEX_STRING
2008-08-25 13:49:47 +02:00
" maria_chk -r", share->open_file_name.str);
Fix for LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery More DBUG_PRINT (to simplify future debugging) Aria: Added STATE_IN_REPAIR, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Aria: Some trivial speedup optimization Aria: Better warning if table was marked crashed by unfinished repair mysql-test/lib/v1/mysql-test-run.pl: Fix so one can run RQG mysql-test/suite/maria/r/maria-recovery2.result: Update for new error message. mysys/stacktrace.c: Fixed compiler warning storage/maria/ha_maria.cc: More DBUG_PRINT Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Don't log query for dropping temporary table. storage/maria/ha_maria.h: Added prototype for drop_table() storage/maria/ma_blockrec.c: More DBUG_PRINT Make read_long_data() inline for most cases. (Trivial speedup optimization) storage/maria/ma_check.c: Better warning if table was marked crashed by unfinished repair storage/maria/ma_open.c: More DBUG_PRINT storage/maria/ma_recovery.c: Give warning if found crashed table. Changed warning for tables that can't be opened. storage/maria/ma_recovery_util.c: Write warnings to DBUG file storage/maria/maria_chk.c: Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. storage/maria/maria_def.h: Added maria_mark_in_repair(x) storage/maria/maria_read_log.c: Added option: --character-sets-dir storage/maria/trnman.c: By default set min_read_from to max value. This allows us to remove TRN:s from rows during recovery to get more space. This fixes bug LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
2010-07-30 09:45:27 +02:00
recovery_found_crashed_tables++;
error= -1; /* not fatal, try with other tables */
goto end;
WL#4374 "Maria - force start if Recovery fails multiple times" http://forge.mysql.com/worklog/task.php?id=4374 new option --maria-force-start-after-recovery-failures=N; number of consecutive recovery failures (failures of log reading or recovery processing, anything in [translog_init(),maria_recovery_from_log()]) is stored in the control file; if at a Maria start they are more than N, logs are removed. This is for automated systems which have to run whatever happens. As tables risk staying corrupted, --maria-recover should also be used on them: this revision makes maria-recover work (it was disabled). Fixed bug in translog_is_log_files(). translog_init() now prints message to error log if failed. Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there. KNOWN_BUGS.txt: As option --maria-force-start-after-recovery-failures is added, it corresponds to the wish "we should fix that if this happens etc". LOAD INDEX is not ignored since a few weeks. Listed concurrency bugs have been fixed some time ago. Recovery of fulltext and GIS indexes works since a few weeks. mysql-test/include/maria_make_snapshot.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/include/maria_make_snapshot_for_comparison.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/include/maria_verify_recovery.inc: configurable prefix in table's name (so far 't' or 't_corrupted') mysql-test/lib/mtr_report.pl: new test maria-recover.test generates expected corruption warnings in the error log. maria-recovery.test's corrupted table is renamed to t_corrupted1 instead of t1. mysql-test/r/maria-preload.result: result update. 
maria_pagecache_read* values are similar to the previous version of this file, though a bit bigger because using the information_schema and the join leads to some internal maria temp table being used, and thus some blocks of it being read. mysql-test/r/maria-purge.result: engine's name in SHOW ENGINE MARIA LOGS changed. mysql-test/r/maria-recover.result: result for new test. We see corruption messages at first SELECT and then none at second SELECT, expected. mysql-test/r/maria-recovery.result: result update mysql-test/r/maria.result: new variables show up mysql-test/t/disabled.def: BUG#34911 is not fixed but the test had been made independent of the bug (workaround). A new bug (crash) has popped recently, so it has to stay disabled (BUG#35107). mysql-test/t/maria-preload.test: Work around BUG#34911 "FLUSH STATUS doesn't flush what it should": compute differences in status variables before and after relevant queries mysql-test/t/maria-recover-master.opt: test --maria-recover mysql-test/t/maria-recover.test: Test of the --maria-recover option (build a corrupted table and see if it is auto-repaired) mysql-test/t/maria-recovery-big.test: update for new API of include/maria*.inc mysql-test/t/maria-recovery-bitmap.test: update for new API of include/maria*.inc mysql-test/t/maria-recovery.test: update for new API of include/maria*.inc. Corrupted table t1 renamed to t_corrupted1, so that mtr_report.pl does not blindly remove all corruption messages for t1 which is a common name. storage/maria/ha_maria.cc: Enabling maria-recover. Adding option and global variable --maria_force_start_after_recovery_failures: ha_maria_init() calls mark_recovery_start() and mark_recovery_success() to keep track of failed consecutive recoveries and remove logs if needed. Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there. 
storage/maria/ma_checkpoint.c: new prototype storage/maria/ma_control_file.c: Storing in one byte in the control file, the number of consecutive recovery failures. storage/maria/ma_control_file.h: new prototype storage/maria/ma_init.c: new prototype storage/maria/ma_locking.c: Need to update open_count on disk at first write and close for transactional tables, like we already did for non-transactional tables, otherwise we cannot notice that the table is dubious. storage/maria/ma_loghandler.c: translog_is_log_files() is made more generic to serve either to search or to delete logs (the latter is for --maria-force-start-after-recovery-failures). It also had a bug (always returned FALSE). storage/maria/ma_loghandler.h: export function because ha_maria::mark_recovery_start() needs it storage/maria/ma_recovery.c: changing name of maria_recover() to distinguish from the maria-recover option. storage/maria/ma_recovery.h: changing name of maria_recover() to distinguish from the maria-recover option. storage/maria/ma_test_force_start.pl: Test of --maria-force-start-after-recovery-failures (and also, to be realistic, of --maria-recover). This is standalone because mysql-test-run does not support testing that multiple mysqld restarts expectedly failed. I'll have to run it on my machine and also on a Windows machine. storage/maria/unittest/ma_control_file-t.c: adding recovery_failures to the test storage/maria/unittest/ma_test_loghandler_multigroup-t.c: fix for compiler warning (unused variable in non-debug build)
2008-06-02 22:53:25 +02:00
/*
Note that if a first recovery fails to apply a REDO, it marks the table
corrupted and stops the entire recovery. A second recovery will find the
table is marked corrupted and skip it (and thus possibly handle other
tables).
*/
}
/* don't log any records for this work */
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and so is incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery neither of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
_ma_tmp_disable_logging_for_table(info, FALSE);
/* execution of some REDO records relies on data_file_length */
dfile_len= my_seek(info->dfile.file, 0, SEEK_END, MYF(MY_WME));
kfile_len= my_seek(info->s->kfile.file, 0, SEEK_END, MYF(MY_WME));
if ((dfile_len == MY_FILEPOS_ERROR) ||
(kfile_len == MY_FILEPOS_ERROR))
{
tprint(tracef, ", length unknown\n");
Fix for LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery More DBUG_PRINT (to simplify future debugging) Aria: Added STATE_IN_REPAIR, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Aria: Some trivial speedup optimization Aria: Better warning if table was marked crashed by unfinished repair mysql-test/lib/v1/mysql-test-run.pl: Fix so one can run RQG mysql-test/suite/maria/r/maria-recovery2.result: Update for new error message. mysys/stacktrace.c: Fixed compiler warning storage/maria/ha_maria.cc: More DBUG_PRINT Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. Don't log query for dropping temporary table. storage/maria/ha_maria.h: Added prototype for drop_table() storage/maria/ma_blockrec.c: More DBUG_PRINT Make read_long_data() inline for most cases. (Trivial speedup optimization) storage/maria/ma_check.c: Better warning if table was marked crashed by unfinished repair storage/maria/ma_open.c: More DBUG_PRINT storage/maria/ma_recovery.c: Give warning if found crashed table. Changed warning for tables that can't be opened. storage/maria/ma_recovery_util.c: Write warnings to DBUG file storage/maria/maria_chk.c: Added STATE_IN_REPAIR flag, which is set on start of repair. This allows us to see if 'crashed' flag was set intentionally. storage/maria/maria_def.h: Added maria_mark_in_repair(x) storage/maria/maria_read_log.c: Added option: --character-sets-dir storage/maria/trnman.c: By default set min_read_from to max value. This allows us to remove TRN:s from rows during recovery to get more space. This fixes bug LP#602604: RQG: ma_blockrec.c:6187: _ma_apply_redo_insert_row_head_or_tail: Assertion `0' failed on Maria engine recovery
2010-07-30 09:45:27 +02:00
recovery_warnings++;
2007-07-26 11:56:21 +02:00
    goto end;
  }
  if (share->state.state.data_file_length != dfile_len)
  {
    tprint(tracef, ", has wrong state.data_file_length (fixing it)");
    share->state.state.data_file_length= dfile_len;
  }
  if (share->state.state.key_file_length != kfile_len)
  {
    tprint(tracef, ", has wrong state.key_file_length (fixing it)");
    share->state.state.key_file_length= kfile_len;
  }
Fixed repair_by_sort to work with BLOCK_RECORD Fixed bugs in undo logging Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) Reserved place for reference-transid on key pages (for packing of transids) ALTER TABLE and INSERT ... SELECT now use fast creation of index Known bugs: ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix) ma_test_recovery.expected is not totally correct, because of the above bug mysqld sometimes fails to restart; Fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate include/maria.h: Prototype changes Added current_filepos to st_maria_sort_info mysql-test/r/maria.result: Updated results that change as alter table and insert ... select now use fast creation of index mysys/mf_iocache.c: Reset variable to guard against double invocation storage/maria/ma_bitmap.c: Added _ma_bitmap_reset_cache() (needed for repair) storage/maria/ma_blockrec.c: Simplify code More initial allocations Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) storage/maria/ma_blockrec.h: Moved TRANSID_SIZE to maria_def.h Added prototype for new functions storage/maria/ma_check.c: Simplify code Fixed repair_by_sort to work with BLOCK_RECORD - When using BLOCK_RECORD or UNPACK create new Maria handle - Use common initializer function - Align code with maria_repair() Made some changes to maria_repair_parallel() to use common initializer function Removed ASK_MONTY section by fixing noted problem storage/maria/ma_close.c: Moved check for readonly to _ma_state_info_write() storage/maria/ma_key_recover.c: Use different log entries if key root changes or not. 
This fixed some bugs when tree grows storage/maria/ma_key_recover.h: Added keynr to st_msg_to_write_hook_for_undo_key storage/maria/ma_loghandler.c: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_loghandler.h: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_open.c: Added TRANSID to all key pages (for future compressing of trans id's) For compressed records, alloc a bit bigger buffer to avoid valgrind warnings If table is opened readonly, don't update state storage/maria/ma_packrec.c: Allocate bigger array for bit unpacking to avoid valgrind errors storage/maria/ma_recovery.c: Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_sort.c: More logging storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Update results Note that this is not complete because of a bug in recovery storage/maria/ma_test_recovery: Removed recreation of index (not needed when we have redo for index pages) storage/maria/maria_chk.c: When using flag --read-only, don't update status for files When using --unpack, don't use REPAIR_BY_SORT if other repair option is given Enable repair_by_sort for BLOCK records Removed not needed newline at start of --describe storage/maria/maria_def.h: Support for TRANSID_SIZE to key pages storage/maria/maria_read_log.c: renamed --only-display to --display-only
2007-11-28 20:38:30 +01:00
  if ((dfile_len % share->block_size) || (kfile_len % share->block_size))
  {
    tprint(tracef, ", has too short last page\n");
    /* Recovery will fix this, no error */
    ALERT_USER();
  }
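The length checks above can be sketched in isolation. This is a minimal sketch, not Maria's code: `struct table_state` and `reconcile_lengths()` are hypothetical names standing in for the relevant members of `MARIA_SHARE` and for the inline logic of the surrounding function. It only mirrors the two steps shown: overwrite the recorded lengths with the physical ones, then report (without failing) when either file length is not a multiple of the block size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the state members touched above. */
struct table_state
{
  uint64_t data_file_length;  /* recorded length of the data file */
  uint64_t key_file_length;   /* recorded length of the index file */
  uint32_t block_size;        /* page size, must be non-zero */
};

/*
  Replace the recorded lengths with the physical ones (dfile_len, kfile_len).
  Returns 1 if the last page of either file is incomplete, 0 otherwise.
  An incomplete last page is only reported; the caller is expected to let
  Recovery rebuild it rather than treat it as an error.
*/
int reconcile_lengths(struct table_state *st,
                      uint64_t dfile_len, uint64_t kfile_len)
{
  assert(st->block_size != 0);
  if (st->data_file_length != dfile_len)
    st->data_file_length= dfile_len;      /* "fixing it" */
  if (st->key_file_length != kfile_len)
    st->key_file_length= kfile_len;       /* "fixing it" */
  return (dfile_len % st->block_size) != 0 ||
         (kfile_len % st->block_size) != 0;
}
```

With a block size of 8192, a physical index file of 8190 bytes makes the function return 1, which corresponds to the "has too short last page" warning path.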
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed in the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state). storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
  /*
    This LSN serves in this situation; assume log is:
    FILE_ID(6->"t2") REDO_INSERT(6) FILE_ID(6->"t1") CHECKPOINT(6->"t1")
    then crash, checkpoint record is parsed and opens "t1" with id 6; assume
    REDO phase starts from the REDO_INSERT above: it will wrongly try to
    update a page of "t1". With this LSN below, REDO_INSERT can realize the
    mapping is newer than itself, and not execute.
    Same example is possible with UNDO_INSERT (update of the state).
  */
  info->s->lsn_of_file_id= lsn_of_file_id;
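The scenario in the comment above can be illustrated with a small sketch. Everything here is hypothetical and not taken from Maria: `lsn_t`, `struct table_slot`, and `record_applies_to_slot()` are illustrative names, and plain integers stand in for real LSNs. The only idea carried over from the source is that a record must be skipped when the id-to-name mapping was established later in the log than the record itself.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;  /* hypothetical; Maria has its own LSN type */

/* Hypothetical view of one slot of the short-id -> table map. */
struct table_slot
{
  const char *name;      /* table the short id currently maps to, or NULL */
  lsn_t lsn_of_file_id;  /* LSN of the FILE_ID record that set the mapping */
};

/*
  Returns 1 if a record logged at record_lsn may be applied to the table in
  this slot, 0 if it must be skipped. A record written before the mapping
  was established was addressed to whatever table held this id earlier.
*/
int record_applies_to_slot(const struct table_slot *slot, lsn_t record_lsn)
{
  if (slot->name == NULL)
    return 0;                              /* id not mapped at all */
  return slot->lsn_of_file_id < record_lsn;
}
```

Taking the log from the comment and assigning made-up LSNs 10, 20, 30, 40 to its four records, the slot for id 6 ends up as `{"t1", 30}` after the checkpoint record is parsed. The REDO_INSERT at LSN 20 is then rejected, while any record for id 6 logged after LSN 30 is accepted.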
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
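The bitmap-page fix described in the commit message above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: the `sketch_*` names are hypothetical and are not part of the Maria sources, the "file" is a plain buffer rather than a real data file, and the marker bytes are assumed for illustration. It shows why extending the file to 8190 bytes first makes a crash between the two steps detectable as "page too short" instead of "page corrupted".

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE   8192u  /* bitmap page size from the commit message */
#define SKETCH_MARKER_SIZE 2u     /* the end marker is the last 2 bytes */

/* Assumed marker bytes, for illustration only */
static const unsigned char sketch_marker[SKETCH_MARKER_SIZE]= { 'b', 'm' };

enum sketch_page_state
{
  SKETCH_PAGE_TOO_SHORT,  /* recovery recreates the page entirely */
  SKETCH_PAGE_OK,         /* full page with a valid end marker */
  SKETCH_PAGE_CORRUPTED   /* full page without marker: recovery would fail */
};

/*
  Simulates the two-step creation in a buffer standing in for the file.
  'steps_done' is the number of steps completed before a simulated crash.
  Returns the resulting file length.
*/
size_t sketch_create_first_bitmap_page(unsigned char *file,
                                       unsigned steps_done)
{
  size_t length= 0;
  if (steps_done >= 1)
  {
    /* Step 1: extend the file with zeros, stopping short of the marker */
    length= SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE;
    memset(file, 0, length);
  }
  if (steps_done >= 2)
  {
    /* Step 2: append the marker, which completes the page */
    memcpy(file + length, sketch_marker, SKETCH_MARKER_SIZE);
    length+= SKETCH_MARKER_SIZE;
  }
  return length;
}

/* Classifies what a later reader would find on disk */
enum sketch_page_state
sketch_check_first_bitmap_page(const unsigned char *file, size_t length)
{
  if (length < SKETCH_PAGE_SIZE)
    return SKETCH_PAGE_TOO_SHORT;
  if (memcmp(file + SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE,
             sketch_marker, SKETCH_MARKER_SIZE) == 0)
    return SKETCH_PAGE_OK;
  return SKETCH_PAGE_CORRUPTED;
}
```

With this layout a crash after step 1 yields `SKETCH_PAGE_TOO_SHORT`, while the older scheme (a full 8192-byte page written before its marker) would have been classified as `SKETCH_PAGE_CORRUPTED`.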
  all_tables[sid].info= info;
  /*
    We don't set info->s->id, it would be useless (no logging in REDO phase);
    if you change that, know that some records in REDO phase call
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
    _ma_update_state_lsns() which resets info->s->id.
  */
  tprint(tracef, ", opened");
  error= 0;
end:
  tprint(tracef, "\n");
  if (error)
  {
    if (info != NULL)
      maria_close(info);
    if (error == -1)
      error= 0;
  }
  return error;
}
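The cleanup path that ends the function above can be read as a three-way error convention. The following is a minimal sketch of that reading, not the actual Maria code: `struct sketch_table`, `sketch_close()` and `sketch_finish_open()` are hypothetical names, and interpreting `-1` as "skip this table without failing recovery" is an assumption based on the commit messages, which say recovery skips unusable tables and moves on to the next record.

```c
#include <stddef.h>

/* Hypothetical stand-in for a table handle; not the real MARIA_HA */
struct sketch_table { int is_open; };

void sketch_close(struct sketch_table *table)
{
  table->is_open= 0;
}

/*
  Assumed convention:
    error == 0   success, the table stays open
    error == -1  table is skipped: close it, but report success
    otherwise    real failure: close the table and report the error
*/
int sketch_finish_open(struct sketch_table *table, int error)
{
  if (error)
  {
    if (table != NULL)
      sketch_close(table);
    if (error == -1)
      error= 0;          /* a skipped table must not abort recovery */
  }
  return error;
}
```

Under this reading, the caller only ever sees 0 or a genuine failure code, and the handle is closed on every path except plain success.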
/*
  NOTE
  This is called for REDO_INSERT_ROW_HEAD and REDO_NEW_ROW_HEAD
*/
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scracth). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
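The three-way group classification described in the commit message above (a record may not end a group, may end a group, or may be a group in itself and abort any unfinished group) can be sketched roughly as follows. This is only an illustration: the enum, the struct and `feed_record()` are hypothetical names, not the identifiers used in ma_loghandler.h or ma_recovery.c, and the counters stand in for actually executing or discarding records.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a log record type (illustrative names). */
enum group_role
{
  REC_NOT_GROUP_END,    /* part of a group, does not end it (e.g. a REDO) */
  REC_GROUP_END,        /* ends the current group (e.g. an UNDO)          */
  REC_IS_GROUP_ITSELF   /* complete on its own (e.g. COMMIT)              */
};

/* Hypothetical bookkeeping for the REDO-phase scan. */
struct group_state
{
  size_t pending;   /* records seen in the still unfinished group */
  size_t executed;  /* records that would be executed             */
  size_t aborted;   /* records discarded with an aborted group    */
};

/* Account for one record according to its role in a group. */
static void feed_record(struct group_state *st, enum group_role role)
{
  assert(st != NULL);
  switch (role)
  {
  case REC_NOT_GROUP_END:
    st->pending++;                    /* wait for the end of the group */
    break;
  case REC_GROUP_END:
    st->executed+= st->pending + 1;   /* whole group is now complete */
    st->pending= 0;
    break;
  case REC_IS_GROUP_ITSELF:
    st->aborted+= st->pending;        /* unfinished group is dropped */
    st->pending= 0;
    st->executed++;                   /* the record itself is handled */
    break;
  }
}
```

Feeding two REDO-like records followed by a group-ending record counts all three as executed; a REDO-like record followed directly by a COMMIT-like record counts the former as aborted, which mirrors the "physical write failure before the UNDO is logged" scenario in the message.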
prototype_redo_exec_hook(REDO_INSERT_ROW_HEAD)
{
int error= 1;
post-merge fixes, and fixes for some of the 16 compiler warnings found in pushbuild on sapsrv1. Some not fixed as not repeatable on my machine (32/64 bit issue?). Fixes for some test failures: - "maria-connect" now passes; - "maria": after fixing the obvious reasons for failures, the test went further and hit a more complex issue: a difference in the EXPLAIN output; not fixed; - "ps_maria" still crashes in assertion mysqld: ha_maria.cc:1627: virtual int ha_maria::index_read(uchar*, const uchar*, uint, ha_rkey_function): Assertion `inited == INDEX' failed, as already observable in pushbuild. All this might just be due to an incomplete merge of MyISAM changes into Maria when 5.1 was last merged to mysql-maria. include/my_global.h: temporary fix until next merge of 5.1; without this it does not build mysql-test/r/maria-connect.result: position changed mysql-test/t/maria-connect.test: If one wants to use the binlog it has to ask for it. 1582 is not used for dup entry error anymore (it was in older 5.1). Size of first event in binlog was increased by 4 (when the new type of event "gap" was added). mysql-test/t/maria.test: 1582 not used anymore in this case storage/maria/ha_maria.cc: engine now has to say what binlogging it supports storage/maria/ma_blockrec.c: fix for compiler warnings ("comparison is always true" or "always false") storage/maria/ma_loghandler.c: fix for compiler warnings (comparing char* to uchar*) storage/maria/ma_packrec.c: fix for compiler warning (fix simply merged from MyISAM) storage/maria/ma_pagecache.c: info_check_pin() was not used so gave a compiler warning. storage/maria/ma_pagecache.h: fixing typo from the last 5.1->maria merge. storage/maria/ma_recovery.c: my_free() has a void* argument, so why cast. byte->uchar. 
storage/maria/ma_search.c: fix for compiler warning (fix simply merged from MyISAM) storage/maria/maria_read_log.c: gptr->uchar* storage/maria/trnman.c: probable fix for warning found in pushbuild (but not on my machine): storage/maria/trnman.c: 142 passing argument 6 of ‘lf_hash_init’ from incompatible pointer type on sapsrv1.
2007-07-26 17:51:49 +02:00
uchar *buff= NULL;
MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
if (info == NULL || maria_is_crashed(info))
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
{
/*
Table was skipped at open time (because later dropped/renamed, not
transactional, or create_rename_lsn newer than LOGREC_FILE_ID), or
record was skipped due to skip_redo_lsn; it is not an error.
*/
return 0;
}
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
/*
Note that a REDO is per page: we still consider it even if its transaction
committed long ago and is unknown.
*/
/*
If REDO's LSN is > page's LSN (read from disk), we are going to modify the
page and change its LSN. The normal runtime code stores the UNDO's LSN
into the page. Here storing the REDO's LSN (rec->lsn) would work
(we are not writing to the log here, so don't have to "flush up to UNDO's
LSN"). But in a test scenario where we do updates at runtime, then remove
tables, apply the log and check that this results in the same table as at
runtime, putting the same LSN as runtime had done will decrease
differences. So we use the UNDO's LSN which is current_group_end_lsn.
*/
enlarge_buffer(rec);
if (log_record_buffer.str == NULL)
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to allocate buffer for record");
goto end;
}
if (translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
  {
    eprint(tracef, "Failed to read record");
    goto end;
  }
  buff= log_record_buffer.str;
  if (_ma_apply_redo_insert_row_head_or_tail(info, current_group_end_lsn,
                                             HEAD_PAGE,
                                             (rec->type ==
                                              LOGREC_REDO_NEW_ROW_HEAD),
                                             buff + FILEID_STORE_SIZE,
                                             buff +
                                             FILEID_STORE_SIZE +
                                             PAGE_STORE_SIZE +
                                             DIRPOS_STORE_SIZE,
                                             rec->record_length -
                                             (FILEID_STORE_SIZE +
                                              PAGE_STORE_SIZE +
                                              DIRPOS_STORE_SIZE)))
    goto end;
  error= 0;
end:
  return error;
}
/*
  NOTE
  This is called for REDO_INSERT_ROW_TAIL and REDO_NEW_ROW_TAIL
*/
2007-07-26 11:56:21 +02:00
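The WL#3072 commit message describes replacing the per-record-type bool with a three-way classification: a record may not end a group, may end a group, or may be a group in itself (like COMMIT), in which case it aborts any unfinished group before it. The following is a minimal standalone sketch of that rule, not the code from ma_recovery.c. The names `toy_group_role`, `toy_record` and `toy_replay_groups` are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the three-way classification of record types. */
enum toy_group_role
{
  TOY_NOT_LAST_IN_GROUP,  /* e.g. a REDO: must wait for the group's end */
  TOY_LAST_IN_GROUP,      /* e.g. an UNDO: completes the current group */
  TOY_IS_GROUP_ITSELF     /* e.g. COMMIT: stands alone, aborts open group */
};

struct toy_record
{
  const char *name;
  enum toy_group_role role;
};

/*
  Walks the log once. Returns how many records would be executed and stores
  in *aborted how many buffered records were thrown away because a
  "group in itself" record arrived before their group was completed.
  A group still open at the end of the log is simply never executed.
*/
static size_t toy_replay_groups(const struct toy_record *log, size_t count,
                                size_t *aborted)
{
  size_t pending= 0, executed= 0, i;
  *aborted= 0;
  for (i= 0; i < count; i++)
  {
    switch (log[i].role)
    {
    case TOY_NOT_LAST_IN_GROUP:
      pending++;                       /* buffer until the group ends */
      break;
    case TOY_LAST_IN_GROUP:
      executed+= pending + 1;          /* group complete: apply all of it */
      pending= 0;
      break;
    case TOY_IS_GROUP_ITSELF:
      *aborted+= pending;              /* unfinished group is discarded */
      pending= 0;
      executed++;                      /* the record itself is applied */
      break;
    }
  }
  return executed;
}
```

With a log of REDO, UNDO, REDO, REDO, COMMIT, the first two records form a complete group and are executed, while the two trailing REDOs are discarded when the COMMIT arrives. This matches the write-failure scenario in the commit message, where the UNDO was never logged.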
prototype_redo_exec_hook(REDO_INSERT_ROW_TAIL)
{
  int error= 1;
  uchar *buff;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL || maria_is_crashed(info))
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
    return 0;
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Bugs fixed:
- If not in autocommit mode, delete rows one by one so that we can roll back if necessary
- bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
- Fixed bug in bitmap handling when allocating tail pages
- Ensure we reserve place for directory entry when calculating place for head and tail pages
- Fixed wrong value in bitmap->size[0]
- Fixed wrong assert in flush_log_for_bitmap
- Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
- Mark new pages as changed (Required to get repair() to work)
- Fixed problem with advancing log horizon pointer within one page bounds
- Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
- Fixed bug in logging of rows with more than one big blob
- Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls
- Flush pagecache when we change from logging to not logging (if not, pagecache code breaks)
- Ensure my_errno is set on return from write/delete/update
- Fixed bug when using FIELD_SKIP_PRESPACE

New features:
- mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution
- Fix that optimize works for Maria tables
- maria_check --zerofill now also clears freed blob pages
- maria_check -di now prints more information about record page utilization

Optimizations:
- Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
- Simplify code to abort when we found optimal bit pattern
- Skip also full head page bit patterns when searching for tail
- Increase default repair buffer to 128M for maria_chk and maria_read_log
- Increase default sort buffer for maria_chk to 64M
- Increase size of sortbuffer and pagecache for mysqld to 64M
- VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables

Better reporting:
- Fixed test of error condition for flush (for better error code)
- More error messages to mysqld if Maria recovery fails
- Always print warning if rows are deleted in repair
- Added global function _db_force_flush() that is usable when doing debugging in gdb
- Added call to my_debug_put_break_here() in case of some errors (for debugging)
- Remove used testfiles in unittest as these were written in different directories depending on from where the test was started

This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria

dbug/dbug.c:
  Added global function _db_force_flush() that is usable when doing debugging in gdb
extra/replace.c:
  Fixed memory leak
include/my_dbug.h:
  Prototype for _db_force_flush()
include/my_global.h:
  Added stdarg.h as my_sys.h now depends on it.
include/my_sys.h:
  Make my_dbug_put_break_here() a NOP if not DBUG build
  Added my_printv_error()
include/myisamchk.h:
  Added entry 'lost' to be able to count space that is lost forever
mysql-test/r/maria.result:
  Updated results
mysql-test/t/maria.test:
  Reset autocommit after test
  New test to check if delete_all_rows is used (verified with --debug)
mysys/my_error.c:
  Added my_printv_error()
scripts/mysql_fix_privilege_tables.sh:
  First use binaries and scripts from source distribution, then in installed distribution
  (This ensures that a development branch doesn't pick up wrong scripts)
sql/mysqld.cc:
  Fix that one can break maria recovery with ^C when debugging
sql/sql_class.cc:
  Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed)
storage/maria/ha_maria.cc:
  Increase size of sortbuffer and pagecache to 64M
  Fix that optimize works for Maria tables
  Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
  If not in autocommit mode, delete rows one by one so that we can roll back if necessary
  Fixed variable comments
storage/maria/ma_bitmap.c:
  More ASSERTS to detect overwrite of bitmap pages
  bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
  Ensure we reserve place for directory entry when calculating place for head and tail pages
  bitmap->size[0] should not include space for directory entry
  Simplify code to abort when we found optimal bit pattern
  Skip also full head page bit patterns when searching for tail (should speed up some common cases)
  Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes
  Fixed wrong assert in flush_log_for_bitmap
  Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
storage/maria/ma_blockrec.c:
  Ensure my_errno is set on return
  Fixed not optimal setting of row->min_length if we don't have variable length fields
  Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
  Added DBUG_ASSERT() if we read or write wrong VARCHAR data
  Added DBUG_ASSERT() to find out if row sizes are calculated wrong
  Fixed bug in logging of rows with more than one big blob
storage/maria/ma_check.c:
  Disable logging while normal repair is done to avoid logging of index changes
  Fixed bug that caused CHECKSUM part of key page to be used
  Fixed that deletion of wrong records also works for BLOCK_RECORD
  Clear unallocated pages:
  - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not
  Better error reporting
  More information about record page utilization
  Change printing of file position to printing of pages to make output more readable
  Always print warning if rows are deleted
storage/maria/ma_create.c:
  Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future)
  Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order
  Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization)
  Store other fields in length order (to get better utilization of head block)
storage/maria/ma_delete.c:
  Ensure my_errno is set on return
storage/maria/ma_dynrec.c:
  Indentation fix
storage/maria/ma_locking.c:
  Set changed if open_count is counted down.
  (To avoid getting error "client is using or hasn't closed the table properly" with transactional tables)
storage/maria/ma_loghandler.c:
  Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja)
  Added more DBUG
  Indentation fixes
storage/maria/ma_open.c:
  Removed wrong casts
storage/maria/ma_page.c:
  Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new()
  Mark new pages as changed (Required to get repair() to work)
storage/maria/ma_pagecache.c:
  Fixed test of error condition for flush
  Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock()
  Added call to my_debug_put_break_here() in case of errors (for debugging)
storage/maria/ma_pagecrc.c:
  Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32
storage/maria/ma_recovery.c:
  Call my_printv_error() from eprint() to get critical errors to mysqld log
  Removed \n from error strings to eprint() to get nicer output in mysqld
  Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed
storage/maria/ma_update.c:
  Ensure my_errno is set on return
storage/maria/ma_write.c:
  Ensure my_errno is set on return
storage/maria/maria_chk.c:
  Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries)
  Added option --skip-safemalloc
  Don't write exponents for rec/key
storage/maria/maria_def.h:
  Increase default repair buffer to 128M for maria_chk and maria_read_log
  Increase default sort buffer for maria_chk to 64M
storage/maria/unittest/Makefile.am:
  Don't update files automatically from bitkeeper
storage/maria/unittest/ma_pagecache_consist.c:
  Remove testfile at end
storage/maria/unittest/ma_pagecache_single.c:
  Remove testfile at end
storage/maria/unittest/ma_test_all-t:
  More tests
  Safer checking if test caused error
2008-01-07 17:54:41 +01:00
    eprint(tracef, "Failed to read record");
    goto end;
  }
  buff= log_record_buffer.str;
  if (_ma_apply_redo_insert_row_head_or_tail(info, current_group_end_lsn,
                                             TAIL_PAGE,
                                             (rec->type ==
                                              LOGREC_REDO_NEW_ROW_TAIL),
                                             buff + FILEID_STORE_SIZE,
                                             buff +
                                             FILEID_STORE_SIZE +
                                             PAGE_STORE_SIZE +
                                             DIRPOS_STORE_SIZE,
                                             rec->record_length -
                                             (FILEID_STORE_SIZE +
                                              PAGE_STORE_SIZE +
                                              DIRPOS_STORE_SIZE)))
    goto end;
  error= 0;
end:
  return error;
}
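The pointer arithmetic in the call above splits the record buffer into a file id prefix, a page/directory-position header, and the row data that follows. A minimal standalone sketch of that split is shown here; the byte sizes, the `redo_tail_view` struct and the `split_redo_record()` helper are assumptions made for illustration and are not part of ma_recovery.c (the real sizes come from the Maria headers).

```c
#include <stddef.h>

/* Assumed sizes, for illustration only; the real values are defined in
   the Maria headers and may differ. */
#define FILEID_STORE_SIZE 2
#define PAGE_STORE_SIZE   5
#define DIRPOS_STORE_SIZE 1
#define REDO_HEADER_SIZE  (FILEID_STORE_SIZE + PAGE_STORE_SIZE + \
                           DIRPOS_STORE_SIZE)

/* Hypothetical view over a REDO record that was read into a buffer. */
typedef struct
{
  const unsigned char *header;  /* page number followed by directory position */
  const unsigned char *data;    /* row data to be applied to the page */
  size_t data_length;           /* number of bytes available at 'data' */
} redo_tail_view;

/*
  Split a record buffer the same way the hook does: skip the file id,
  point 'header' at the page/dirpos bytes and 'data' past the full header.
  Returns 0 on success, 1 if the record is too short to hold a header.
*/
static int split_redo_record(const unsigned char *buff, size_t record_length,
                             redo_tail_view *out)
{
  if (record_length < REDO_HEADER_SIZE)
    return 1;
  out->header= buff + FILEID_STORE_SIZE;
  out->data= buff + REDO_HEADER_SIZE;
  out->data_length= record_length - REDO_HEADER_SIZE;
  return 0;
}
```

Unlike the hook above, the sketch adds an explicit length check before subtracting the header size, so a short record cannot produce a wrapped-around data length.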
Merge some changes from sql directory in 5.1 tree
Changed format for REDO_INSERT_ROWS_BLOBS
Fixed several bugs in handling of big blobs
Added redo_free_head_or_tail() & redo_insert_row_blobs()
Added uuid to control file
maria_checks now verifies that the unused part of the bitmap is 0
REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS
Added REDO_FREE_HEAD_OR_TAIL
Fixes problem when trying to read block outside of file during REDO

include/my_global.h:
  STACK_DIRECTION is already set by configure
mysql-test/r/maria.result:
  Updated results
mysql-test/t/maria.test:
  Test shrinking of VARCHAR
mysys/my_realloc.c:
  Fixed indentation
mysys/safemalloc.c:
  Fixed indentation
sql/filesort.cc:
  Removed some casts
sql/mysqld.cc:
  Added missing setting of myisam_stats_method_str
sql/uniques.cc:
  Removed some casts
storage/maria/ma_bitmap.c:
  Added printing of bitmap (for debugging)
  Renamed _ma_print_bitmap() -> _ma_print_bitmap_changes()
  Added _ma_set_full_page_bits()
  Fixed bug in ma_bitmap_find_new_place() (affecting updates) when using big files
storage/maria/ma_blockrec.c:
  Changed format for REDO_INSERT_ROWS_BLOBS
  Fixed several bugs in handling of big blobs
  Added code to fix some cases where redo when using blobs didn't produce identical .MAD files as normal usage
  REDO_FREE_ROW_BLOCKS no longer changes pages; we only mark things free in bitmap
  Remove TAIL and filler extents from REDO_FREE_BLOCKS log entry. (Fixed some asserts)
  REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS
  Delete tails in update. (Fixed bug when doing update that shrinks blob/varchar length)
  Fixed bug when doing insert in block outside of file size.
  Added redo_free_head_or_tail() & redo_insert_row_blobs()
  Added pagecache_unlock_by_link() when read fails.
  Many more comments, DBUG and ASSERT entries
storage/maria/ma_blockrec.h:
  Prototypes of new functions
  Define of SUB_RANGE_SIZE & BLOCK_FILLER_SIZE
storage/maria/ma_check.c:
  Verify that the unused part of the bitmap is 0
storage/maria/ma_control_file.c:
  Added uuid to control file
storage/maria/ma_loghandler.c:
  REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS
  Added REDO_FREE_HEAD_OR_TAIL
storage/maria/ma_loghandler.h:
  REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS
  Added REDO_FREE_HEAD_OR_TAIL
storage/maria/ma_pagecache.c:
  If we write full block, remove error flag for block. (Fixes problem when trying to read block outside of file)
storage/maria/ma_recovery.c:
  REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS
  Added REDO_FREE_HEAD_OR_TAIL
storage/maria/ma_test1.c:
  Allow option after 'b' to be compatible with ma_test2 (This is just to simplify test scripts like ma_test_recovery)
storage/maria/ma_test2.c:
  Default size of blob is now 1000 instead of 1
storage/maria/ma_test_all.sh:
  Added test for bigger blobs
storage/maria/ma_test_recovery.expected:
  Updated results
storage/maria/ma_test_recovery:
  Added test for bigger blobs
2007-10-19 23:24:22 +02:00
prototype_redo_exec_hook(REDO_INSERT_ROW_BLOBS)
{
  int error= 1;
  uchar *buff;
Fix for BUG#41493 "Maria: two recovery failures (wrong logging of BLOB pages)" and some more debugging output related to this.

mysql-test/suite/maria/r/maria-recovery3.result:
  result update
mysql-test/suite/maria/t/maria-recovery3.test:
  Test for bug; before the fix, the "CHECK TABLE EXTENDED" would mention a bad bitmap, because the REDO_INSERT_ROW_BLOBS was containing a page number which was actually the one of a tail, so execution of this record would mark the tail page as full in bitmap (as if it were a blob page), though it wasn't full. Also, the assertion added around ma_blockrec.c:6580 in the present revision fired.
storage/maria/ma_blockrec.c:
  - fix for BUG#41493: if we found out that logging was not needed at this point (blob_length==0 i.e. tail page), then we forgot to increment tmp_block, so in the second iteration (assuming two BLOB columns), we would log the page range of the first iteration (i.e. the tail page's number) for this second BLOB, which would cause Recovery to overwrite the tail page with the second BLOB.
  - assert when marking the table corrupted during REDO phase; this catches some problems earlier, otherwise they get caught only when a later record wants to use the table.
  - _ma_apply_redo_insert_row_blobs() now fills some synthetic info about the blobs and pages involved in a REDO_INSERT_ROW_BLOBS record, for inclusion into maria_recovery.trace: number of blobs, of ranges, first and last page (does not tell about any gaps in the middle, but good enough for now). It also asserts that it's not overwriting a tail/head page (which happened in the bug).
storage/maria/ma_blockrec.h:
  new prototype for _ma_apply_redo_insert_row_blobs
storage/maria/ma_recovery.c:
  Print info obtained from _ma_apply_redo_insert_row_blobs() to maria_recovery.trace (so far this file had mentioned what head and tail pages a record touched, but not blob pages).
2009-01-15 16:14:47 +01:00
  uint number_of_blobs, number_of_ranges;
  pgcache_page_no_t first_page, last_page;
  char llbuf1[22], llbuf2[22];
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL || maria_is_crashed(info))
    return 0;
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocating tail pages - Ensure we reserve place for directory entry when calculating place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clears freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these were written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdb extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution (This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculating place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deletion of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables) storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automatically from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
    eprint(tracef, "Failed to read record");
    goto end;
  }
  buff= log_record_buffer.str;
  if (_ma_apply_redo_insert_row_blobs(info, current_group_end_lsn,
Fix for BUG#41493 "Maria: two recovery failures (wrong logging of BLOB pages)" and some more debugging output related to this. mysql-test/suite/maria/r/maria-recovery3.result: result update mysql-test/suite/maria/t/maria-recovery3.test: Test for bug; before the fix, the "CHECK TABLE EXTENDED" would mention a bad bitmap, because the REDO_INSERT_ROW_BLOBS was containing a page number which was actually the one of a tail, so execution of this record would mark the tail page as full in bitmap (like if it were a blob page), though it wasn't full. Also, the assertion added around ma_blockrec.c:6580 in the present revision fired. storage/maria/ma_blockrec.c: - fix for BUG#41493: if we found out that logging was not needed at this point (blob_length==0 i.e. tail page), then we forgot to increment tmp_block, so in the second iteration (assuming two BLOB columns), we would log the page range of the first iteration (i.e. the tail page's number) for this second BLOB, which would cause Recovery to overwrite the tail page with the second BLOB. - assert when marking the table corrupted during REDO phase; this catches some problems earlier otherwise they get caught only when a later record wants to use the table. - _ma_apply_redo_insert_row_blobs() now fills some synthetic info about the blobs and pages involved in a REDO_INSERT_ROW_BLOBS record, for inclusion into maria_recovery.trace: number of blobs, of ranges, first and last page (does not tell about any gaps in the middle, but good enough for now). It also asserts that it's not overwriting a tail/head page (which happened in the bug). storage/maria/ma_blockrec.h: new prototype for _ma_apply_redo_insert_row_blobs storage/maria/ma_recovery.c: Print info got from _ma_apply_redo_insert_row_blobs() to maria_recovery.trace (so far this file had mentioned what head and tail pages a record touched, but not blob pages).
2009-01-15 16:14:47 +01:00
                                      buff, rec->lsn, &number_of_blobs,
                                      &number_of_ranges,
                                      &first_page, &last_page))
    goto end;
  llstr(first_page, llbuf1);
  llstr(last_page, llbuf2);
  tprint(tracef, " %u blobs %u ranges, first page %s last %s",
         number_of_blobs, number_of_ranges, llbuf1, llbuf2);
  error= 0;
end:
  tprint(tracef, " \n");
  return error;
}
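The BUG#41493 annotation on this function describes a logging loop that forgot to advance its block pointer when a BLOB column needed no logging (blob_length == 0), so the next BLOB was logged with the previous column's page range. A minimal sketch of the corrected pattern follows. It is an illustration only: `extent_t`, `collect_blob_ranges()` and the one-extent-per-column layout are hypothetical simplifications, not the actual code in ma_blockrec.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified page range; not the real Maria block struct. */
typedef struct
{
  unsigned long long first_page;
  unsigned page_count;
} extent_t;

/*
  Collect the page ranges to log for each BLOB column.
  Assumption for this sketch: extents[i] belongs to column i.
  blob_lengths[i] == 0 means column i needs no blob pages logged,
  but its extent slot must still be skipped.
  Returns the number of ranges written to out[].
*/
size_t collect_blob_ranges(const unsigned long *blob_lengths, size_t n_blobs,
                           const extent_t *extents, extent_t *out)
{
  const extent_t *cur= extents;
  size_t n_out= 0;
  size_t i;

  /* cur advances in the loop header, so "continue" cannot skip it */
  for (i= 0; i < n_blobs; i++, cur++)
  {
    if (blob_lengths[i] == 0)
      continue;                 /* nothing to log for this column */
    out[n_out++]= *cur;         /* log this column's own range */
  }
  return n_out;
}
```

Advancing the pointer in the loop header rather than inside the logging branch is the design point: with two columns where the first has length 0, the second column is logged with its own range rather than the first column's tail page.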
prototype_redo_exec_hook(REDO_PURGE_ROW_HEAD)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  /* No usable table for a REDO is not an error: skip the record */
  if (info == NULL || maria_is_crashed(info))
    return 0;
  if (_ma_apply_redo_purge_row_head_or_tail(info, current_group_end_lsn,
                                            HEAD_PAGE,
                                            rec->header + FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}


prototype_redo_exec_hook(REDO_PURGE_ROW_TAIL)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  /* No usable table for a REDO is not an error: skip the record */
  if (info == NULL || maria_is_crashed(info))
    return 0;
  if (_ma_apply_redo_purge_row_head_or_tail(info, current_group_end_lsn,
                                            TAIL_PAGE,
                                            rec->header + FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}
Merge some changes from sql directory in 5.1 tree Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added redo_free_head_or_tail() & redo_insert_row_blobs() Added uuid to control file maria_checks now verifies that not used part of bitmap is 0 REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL Fixes problem when trying to read block outside of file during REDO include/my_global.h: STACK_DIRECTION is already set by configure mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Test shrinking of VARCHAR mysys/my_realloc.c: Fixed indentation mysys/safemalloc.c: Fixed indentation sql/filesort.cc: Removed some casts sql/mysqld.cc: Added missing setting of myisam_stats_method_str sql/uniques.cc: Removed some casts storage/maria/ma_bitmap.c: Added printing of bitmap (for debugging) Renamed _ma_print_bitmap() -> _ma_print_bitmap_changes() Added _ma_set_full_page_bits() Fixed bug in ma_bitmap_find_new_place() (affecting updates) when using big files storage/maria/ma_blockrec.c: Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added code to fix some cases where redo when using blobs didn't produce idenital .MAD files as normal usage REDO_FREE_ROW_BLOCKS doesn't anymore change pages; We only mark things free in bitmap Remove TAIL and filler extents from REDO_FREE_BLOCKS log entry. (Fixed some asserts) REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Delete tails in update. (Fixed bug when doing update that shrinks blob/varchar length) Fixed bug when doing insert in block outside of file size. Added redo_free_head_or_tail() & redo_insert_row_blobs() Added pagecache_unlock_by_link() when read fails. 
Much more comments, DBUG and ASSERT entries storage/maria/ma_blockrec.h: Prototypes of new functions Define of SUB_RANGE_SIZE & BLOCK_FILLER_SIZE storage/maria/ma_check.c: Verify that not used part of bitmap is 0 storage/maria/ma_control_file.c: Added uuid to control file storage/maria/ma_loghandler.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_loghandler.h: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_pagecache.c: If we write full block, remove error flag for block. (Fixes problem when trying to read block outside of file) storage/maria/ma_recovery.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_test1.c: Allow option after 'b' to be compatible with ma_test2 (This is just to simplify test scripts like ma_test_recovery) storage/maria/ma_test2.c: Default size of blob is now 1000 instead of 1 storage/maria/ma_test_all.sh: Added test for bigger blobs storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Added test for bigger blobs
2007-10-19 23:24:22 +02:00
prototype_redo_exec_hook(REDO_FREE_BLOCKS)
{
  int error= 1;
  uchar *buff;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL || maria_is_crashed(info))
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
    return 0;
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
    eprint(tracef, "Failed to read record");
    goto end;
  }
  buff= log_record_buffer.str;
  if (_ma_apply_redo_free_blocks(info, current_group_end_lsn,
                                 buff + FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}

prototype_redo_exec_hook(REDO_FREE_HEAD_OR_TAIL)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL || maria_is_crashed(info))
    return 0;
  if (_ma_apply_redo_free_head_or_tail(info, current_group_end_lsn,
                                       rec->header + FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}

prototype_redo_exec_hook(REDO_DELETE_ALL)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL)
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
    return 0;
  tprint(tracef, " deleting all %lu rows\n",
         (ulong)info->s->state.state.records);
  if (maria_delete_all_rows(info))
    goto end;
  error= 0;
end:
  return error;
}
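
The REDO hooks above share one control-flow shape: start with `error= 1`, return 0 early when no table is attached to the record (for a REDO this is not an error), and only set `error= 0` after the apply step succeeds. Below is a minimal standalone sketch of that shape. All names in it (`toy_table`, `toy_delete_all_rows`, `toy_redo_hook`, `fail_apply`) are hypothetical stand-ins for illustration and are not part of Maria.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an open table handle; not the real MARIA_HA. */
typedef struct toy_table
{
  int fail_apply;        /* test knob: non-zero forces the apply step to fail */
  unsigned long rows;    /* current number of rows */
} toy_table;

/* Hypothetical apply step: returns 0 on success, non-zero on failure. */
static int toy_delete_all_rows(toy_table *t)
{
  assert(t != NULL);
  if (t->fail_apply)
    return 1;
  t->rows= 0;
  return 0;
}

/*
  Same shape as the hooks above: a missing table means the record is
  skipped (return 0), a failing apply step is reported (return 1).
*/
static int toy_redo_hook(toy_table *info)
{
  int error= 1;                 /* pessimistic default */
  if (info == NULL)
    return 0;                   /* no table for this REDO: not an error */
  if (toy_delete_all_rows(info))
    goto end;
  error= 0;
end:
  return error;
}
```

The single `end:` label keeps one exit path for failures, so cleanup added later only has to go in one place.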
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
prototype_redo_exec_hook(REDO_INDEX)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL || maria_is_crashed(info))
    return 0;
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
    eprint(tracef, "Failed to read record");
    goto end;
  }
  if (_ma_apply_redo_index(info, current_group_end_lsn,
                           log_record_buffer.str + FILEID_STORE_SIZE,
                           rec->record_length - FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}

prototype_redo_exec_hook(REDO_INDEX_NEW_PAGE)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL || maria_is_crashed(info))
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
    return 0;
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
    eprint(tracef, "Failed to read record");
    goto end;
  }
  if (_ma_apply_redo_index_new_page(info, current_group_end_lsn,
                                    log_record_buffer.str + FILEID_STORE_SIZE,
                                    rec->record_length - FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}

prototype_redo_exec_hook(REDO_INDEX_FREE_PAGE)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL || maria_is_crashed(info))
2007-11-14 18:08:06 +01:00
    return 0;
  if (_ma_apply_redo_index_free_page(info, current_group_end_lsn,
                                     rec->header + FILEID_STORE_SIZE))
    goto end;
  error= 0;
end:
  return error;
}
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, including the index sort (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
prototype_redo_exec_hook(REDO_BITMAP_NEW_PAGE)
{
  int error= 1;
  MARIA_HA *info= get_MARIA_HA_from_REDO_record(rec);
  if (info == NULL || maria_is_crashed(info))
    return 0;
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
    eprint(tracef, "Failed to read record");
    goto end;
  }
  if (cmp_translog_addr(rec->lsn, checkpoint_start) >= 0)
  {
    /*
      Record is potentially after the bitmap flush made by Checkpoint, so has
      to be replayed. It may overwrite a more recent state but that will be
      corrected by all upcoming REDOs for data pages.
      If the condition is false, we must not apply the record: it is unneeded
      and harmful (may not be corrected as REDOs can be skipped due to
      dirty-pages list).
    */
    if (_ma_apply_redo_bitmap_new_page(info, current_group_end_lsn,
                                       log_record_buffer.str +
                                       FILEID_STORE_SIZE))
      goto end;
  }
  error= 0;
end:
  return error;
}
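The replay rule described in the comment of the hook above can be sketched in isolation. This is only an illustration under stated assumptions, not the Maria code: `lsn_t`, `cmp_addr()` and `should_replay_bitmap_record()` are hypothetical names, and log addresses are modelled as plain 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a log address; real LSNs carry more structure. */
typedef uint64_t lsn_t;

/* Three-way comparison: negative, zero or positive, like an address compare. */
static int cmp_addr(lsn_t a, lsn_t b)
{
  return (a > b) - (a < b);
}

/*
  A bitmap record logged at or after the checkpoint start may postdate the
  bitmap flush done by the checkpoint, so it has to be replayed. An older
  record is already reflected on disk; replaying it could leave a stale
  bitmap that no later REDO is guaranteed to correct, so it is skipped.
*/
static int should_replay_bitmap_record(lsn_t rec_lsn, lsn_t checkpoint_start)
{
  return cmp_addr(rec_lsn, checkpoint_start) >= 0;
}
```

With a checkpoint starting at address 100, a record at 99 would be skipped, while records at 100 and 250 would be replayed.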
static inline void set_undo_lsn_for_active_trans(uint16 short_trid, LSN lsn)
{
  if (all_active_trans[short_trid].long_trid == 0)
  {
    /* transaction unknown, so has committed or fully rolled back long ago */
    return;
  }
  all_active_trans[short_trid].undo_lsn= lsn;
  if (all_active_trans[short_trid].first_undo_lsn == LSN_IMPOSSIBLE)
    all_active_trans[short_trid].first_undo_lsn= lsn;
}
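The bookkeeping done by set_undo_lsn_for_active_trans() can be tried out with a small standalone sketch. The names below (`lsn_t`, `LSN_NONE`, `struct trans_slot`, `note_undo()`) are hypothetical and only mirror the shape of the function above: the latest UNDO address is always overwritten, the first one is recorded once, and slots without a known transaction are ignored.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t lsn_t;
#define LSN_NONE ((lsn_t) 0)   /* hypothetical "no LSN recorded yet" marker */

/* Hypothetical per-transaction slot, indexed by a short transaction id. */
struct trans_slot
{
  uint64_t long_id;    /* 0 means no known transaction occupies the slot */
  lsn_t first_undo;    /* address of the first UNDO seen, set once */
  lsn_t last_undo;     /* address of the most recent UNDO seen */
};

static void note_undo(struct trans_slot *slots, uint16_t short_id, lsn_t lsn)
{
  struct trans_slot *t= &slots[short_id];
  if (t->long_id == 0)
    return;                     /* unknown transaction: nothing to record */
  t->last_undo= lsn;
  if (t->first_undo == LSN_NONE)
    t->first_undo= lsn;
}
```

Keeping both ends of the UNDO chain lets a later rollback start from the latest UNDO while still knowing where the transaction's changes began.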
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full-size bitmap page without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
prototype_redo_exec_hook(UNDO_ROW_INSERT)
{
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open() Zerofill new
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
if (info == NULL)
{
/*
Note that we set undo_lsn anyway, so that if the transaction is later
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional.
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
rolled back, this UNDO is tried for execution and we get a warning (as
it would then be abnormal that info==NULL).
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
*/
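The commit message above says the hash key in all_dirty_pages is now built from index_or_data, the table's short id and the page number, instead of a file descriptor and page number. A minimal sketch of one way to pack such a key follows. The bit layout, the 40-bit page-number limit and the name `sketch_dirty_page_key` are assumptions made for illustration; they are not taken from ma_recovery.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: page numbers fit in 40 bits (5 bytes). */
#define SKETCH_PAGENO_BITS 40

/*
  Build a lookup key from (index-or-data flag, table short id, page number).
  Hypothetical layout: bit 56 = flag, bits 40..55 = short id,
  bits 0..39 = page number.
*/
static uint64_t sketch_dirty_page_key(int is_index, uint16_t short_id,
                                      uint64_t pageno)
{
  assert(pageno < ((uint64_t) 1 << SKETCH_PAGENO_BITS));
  return ((uint64_t) (is_index != 0) << (SKETCH_PAGENO_BITS + 16)) |
         ((uint64_t) short_id << SKETCH_PAGENO_BITS) |
         pageno;
}
```

With a key of this shape, page 7 of the data file and page 7 of the index file of the same table map to different entries, which a key built only from the page number could not guarantee.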
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
return 0;
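The commit message above describes storing, in each CLR_END, the LSN of the previous UNDO of the transaction, and notes that this LSN can be 0 for the transaction's first UNDO. A minimal sketch of how a rollback could walk such a chain follows. The types and names (`sketch_undo`, `sketch_rollback`) are hypothetical and the log is modeled as a plain array; the real code reads records through the translog API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of an UNDO record. */
typedef struct {
  uint64_t lsn;            /* LSN of this UNDO record */
  uint64_t prev_undo_lsn;  /* LSN of the previous UNDO, 0 if this is the first */
} sketch_undo;

/* Linear search stands in for reading a record at a given LSN. */
static const sketch_undo *sketch_find_undo(const sketch_undo *undos, size_t n,
                                           uint64_t lsn)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (undos[i].lsn == lsn)
      return &undos[i];
  return NULL;
}

/*
  Walk the chain starting at undo_lsn. For each executed UNDO, record in
  clr_prev[] the LSN that its CLR_END would carry (the previous UNDO's LSN),
  then continue from that LSN. Returns the number of UNDOs executed.
  clr_prev must have room for n entries.
*/
static size_t sketch_rollback(const sketch_undo *undos, size_t n,
                              uint64_t undo_lsn, uint64_t *clr_prev)
{
  size_t executed = 0;
  assert(undos != NULL && clr_prev != NULL);
  while (undo_lsn != 0 && executed < n)
  {
    const sketch_undo *undo = sketch_find_undo(undos, n, undo_lsn);
    if (undo == NULL)
      break;                               /* broken chain: stop */
    clr_prev[executed++] = undo->prev_undo_lsn;
    undo_lsn = undo->prev_undo_lsn;        /* what trn->undo_lsn would become */
  }
  return executed;
}
```

Because 0 is a legitimate "no earlier UNDO" value in this chain, it cannot also mean "this is not UNDO work", which is the reason the commit gives for switching that sentinel to LSN_ERROR.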
}
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always
allocate info.pinned_pages (we now need them also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
share= info->s;
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expects the new row's checksum in cur_row.checksum (was already true) and the old row's checksum in new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c.
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed in seconds with a fractional part of one digit (like 123.4 seconds).
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum appears to work (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, the live checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if().
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
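The checksum-delta bookkeeping described in this commit message can be sketched as follows. This is a minimal, hypothetical model, not the code in ma_blockrec.c or ma_recovery.c: `toy_state`, `toy_undo`, `toy_checksum_delta()` and `toy_redo_apply_undo()` are invented names, LSNs are reduced to plain integers, and the live checksum is assumed to be a 32-bit value with wrap-around arithmetic (the message itself only says that the delta adds 4 bytes to the record).

```c
#include <stdint.h>

/* Hypothetical, simplified table state; not the real MARIA_SHARE. */
typedef struct {
  uint64_t is_of_lsn;  /* state already reflects records older than this */
  uint64_t records;    /* row count */
  uint32_t checksum;   /* live checksum (assumed 32-bit, wraps around) */
} toy_state;

/* Hypothetical UNDO record as the REDO phase would see it. */
typedef struct {
  uint64_t lsn;            /* position of the record in the log */
  int      records_diff;   /* +1 for insert, -1 for delete, 0 for update */
  int      has_checksum;   /* non-zero if the table has a live checksum */
  uint32_t checksum_delta; /* new - old, modulo 2^32 */
} toy_undo;

/* Delta stored in the UNDO at write time. */
static uint32_t toy_checksum_delta(uint32_t old_sum, uint32_t new_sum)
{
  return new_sum - old_sum;  /* unsigned arithmetic wraps as intended */
}

/*
  REDO phase: apply the UNDO's effect on the state only if the state is
  not newer than the record. Returns 1 if applied, 0 if skipped.
*/
static int toy_redo_apply_undo(toy_state *st, const toy_undo *rec)
{
  if (rec->lsn < st->is_of_lsn)
    return 0;                /* state already includes this record */
  st->records= (uint64_t) ((int64_t) st->records + rec->records_diff);
  if (rec->has_checksum)
    st->checksum+= rec->checksum_delta;
  return 1;
}
```

Storing the delta rather than the absolute checksum is what lets the REDO phase replay records one at a time without recomputing any row checksum.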
2007-10-02 18:02:09 +02:00
  if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
2007-07-26 11:56:21 +02:00
  {
* WL#4137 Maria - Framework for testing recovery in mysql-test-run. See test maria-recovery.test for a model; every include script that takes parameters from outside has an "API" section at its start.
* Fixing bug reported by Jani and Monty (two REDOs about the same page in one group, see ma_blockrec.c).
* Fixing small bugs in recovery.
mysql-test/include/wait_until_connected_again.inc:
be sure to enter the loop (the previous query by the caller may not have failed: the sequence could be query; mysqladmin shutdown; call this script).
mysql-test/lib/mtr_process.pl:
* Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it waits until the last line no longer starts with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld.
* Remove the "expect" file before restarting; otherwise there could be a race condition: the test sees the server restarted, does something, writes an "expect" file, then mtr removes that file, then the test kills mysqld, and mtr never restarts it.
storage/maria/ma_blockrec.c:
- when applying a REDO in recovery, we no longer put the UNDO's LSN on the page at once; if another REDO for the same page came later in the same group, it would be wrongly skipped. Instead, we keep pages pinned and don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with the UNDO's LSN.
- fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set; the bit must be cleared to know the real page range.
- Both bugs are covered in maria-recovery.test
storage/maria/ma_checkpoint.c:
Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols.
storage/maria/ma_open.c:
debugging info
storage/maria/ma_pagecache.c:
Now that we can _ma_unpin_all_pages() during the REDO phase to set a page's LSN, the assertion needs to be relaxed.
storage/maria/ma_recovery.c:
- open the trace file in append mode (useful when a test triggers several recoveries: we see them all).
- fixing wrong error detection: during recovery we may legitimately want to open an already open table.
- when applying a REDO in recovery, we no longer put the UNDO's LSN on the page at once; if another REDO for the same page came later in the same group, it would be wrongly skipped. Instead, we keep pages pinned and don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with the UNDO's LSN.
- we verify that all log records of a group are about the same table, for debugging.
mysql-test/r/maria-recovery.result:
result
mysql-test/t/maria-recovery-master.opt:
crash is expected, core file would take room, stack trace would wake pushbuild up.
mysql-test/t/maria-recovery.test:
Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137)
- test that, if recovery is made to start on an empty table, it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs.
- test that, if mysqld is crashed and recovery runs, we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so the uncommitted statement is often missing from the log => no rollback to do); flush pagecache (which implicitly flushes the log (WAL)) and flush log, both of which cause rollbacks; flush log can also flush the state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway).
- test of bug found by Jani and Monty in recovery (two REDOs about the same page in one group).
mysql-test/include/maria_empty_logs.inc:
removes logs, to have a clean sheet for testing recovery.
mysql-test/include/maria_make_snapshot.inc:
copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened).
mysql-test/include/maria_make_snapshot_for_comparison.inc:
copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones).
mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc:
When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is of course not a scenario from which recovery is possible.
mysql-test/include/maria_verify_recovery.inc:
causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on pages' LSNs in resulting tables, yet.
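The deferred stamping described for ma_blockrec.c and ma_recovery.c can be illustrated with a small model. It is a sketch under stated assumptions, not the engine's code: `toy_page`, `toy_redo` and `toy_apply_group()` are hypothetical names, LSNs are plain integers, a page's content is a single counter, and "pinned" is just a flag rather than a page cache pin.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page: one counter stands in for the page's content. */
typedef struct {
  uint64_t lsn;     /* LSN stamped on the page */
  int      value;   /* toy content modified by REDOs */
  int      pinned;  /* touched by the group currently being applied */
} toy_page;

/* Hypothetical REDO record: add 'add' to page number 'page_no'. */
typedef struct {
  uint64_t lsn;
  size_t   page_no;
  int      add;
} toy_redo;

/*
  Apply one group of REDOs. A REDO is skipped if the page's LSN shows
  the change is already there. Pages are stamped with group_end_lsn
  (the UNDO's LSN) only after the whole group was applied, so a second
  REDO for the same page is not skipped by mistake.
  Returns the number of REDOs applied.
*/
static int toy_apply_group(toy_page *pages, size_t n_pages,
                           const toy_redo *redos, size_t n_redos,
                           uint64_t group_end_lsn)
{
  int applied= 0;
  size_t i;
  for (i= 0; i < n_redos; i++)
  {
    toy_page *page;
    if (redos[i].page_no >= n_pages)
      continue;                     /* out of range: ignored in this toy */
    page= &pages[redos[i].page_no];
    if (page->lsn >= redos[i].lsn)
      continue;                     /* page already contains this change */
    page->value+= redos[i].add;
    page->pinned= 1;                /* keep pinned, LSN left untouched */
    applied++;
  }
  for (i= 0; i < n_pages; i++)      /* end of group: unpin and stamp */
  {
    if (pages[i].pinned)
    {
      pages[i].lsn= group_end_lsn;
      pages[i].pinned= 0;
    }
  }
  return applied;
}
```

Had the first loop stamped the page with `group_end_lsn` immediately, the second REDO for the same page (whose LSN is lower than the group's end) would have failed the LSN test and been dropped, which is the bug this commit describes.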
2007-11-13 17:12:29 +01:00
    tprint(tracef, " state has LSN (%lu,0x%lx) older than record, updating"
           " rows' count\n", LSN_IN_PARTS(share->state.is_of_horizon));
2007-10-02 18:02:09 +02:00
    share->state.state.records++;
    if (share->calc_checksum)
    {
      uchar buff[HA_CHECKSUM_STORE_SIZE];
      if (translog_read_record(rec->lsn, LSN_STORE_SIZE + FILEID_STORE_SIZE +
                               PAGE_STORE_SIZE + DIRPOS_STORE_SIZE,
                               HA_CHECKSUM_STORE_SIZE, buff, NULL) !=
          HA_CHECKSUM_STORE_SIZE)
      {
Bugs fixed:
- If not in autocommit mode, delete rows one by one so that we can roll back if necessary
- bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
- Fixed bug in bitmap handling when allocating tail pages
- Ensure we reserve place for directory entry when calculating place for head and tail pages
- Fixed wrong value in bitmap->size[0]
- Fixed wrong assert in flush_log_for_bitmap
- Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
- Mark new pages as changed (Required to get repair() to work)
- Fixed problem with advancing log horizon pointer within one page bounds
- Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
- Fixed bug in logging of rows with more than one big blob
- Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls
- Flush pagecache when we change from logging to not logging (if not, pagecache code breaks)
- Ensure my_errno is set on return from write/delete/update
- Fixed bug when using FIELD_SKIP_PRESPACE
New features:
- mysql_fix_privilege_tables now first uses binaries and scripts from the source distribution, then from the installed distribution
- Fix that optimize works for Maria tables
- maria_check --zerofill now also clears freed blob pages
- maria_check -di now prints more information about record page utilization
Optimizations:
- Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
- Simplify code to abort when we found optimal bit pattern
- Skip also full head page bit patterns when searching for tail
- Increase default repair buffer to 128M for maria_chk and maria_read_log
- Increase default sort buffer for maria_chk to 64M
- Increase size of sortbuffer and pagecache for mysqld to 64M
- VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables
Better reporting:
- Fixed test of error condition for flush (for better error code)
- More error messages to mysqld if Maria recovery fails
- Always print warning if rows are deleted in repair
- Added global function _db_force_flush() that is usable when doing debugging in gdb
- Added call to my_debug_put_break_here() in case of some errors (for debugging)
- Remove used testfiles in unittest as these were written in different directories depending on where the test was started from
This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria
dbug/dbug.c:
Added global function _db_force_flush() that is usable when doing debugging in gdb
extra/replace.c:
Fixed memory leak
include/my_dbug.h:
Prototype for _db_force_flush()
include/my_global.h:
Added stdarg.h as my_sys.h now depends on it.
include/my_sys.h:
Make my_dbug_put_break_here() a NOP if not DBUG build
Added my_printv_error()
include/myisamchk.h:
Added entry 'lost' to be able to count space that is lost forever
mysql-test/r/maria.result:
Updated results
mysql-test/t/maria.test:
Reset autocommit after test
New test to check if delete_all_rows is used (verified with --debug)
mysys/my_error.c:
Added my_printv_error()
scripts/mysql_fix_privilege_tables.sh:
First use binaries and scripts from the source distribution, then from the installed distribution (This ensures that a development branch doesn't pick up wrong scripts)
sql/mysqld.cc:
Fix that one can break maria recovery with ^C when debugging
sql/sql_class.cc:
Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed)
storage/maria/ha_maria.cc:
Increase size of sortbuffer and pagecache to 64M
Fix that optimize works for Maria tables
Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
If not in autocommit mode, delete rows one by one so that we can roll back if necessary
Fixed variable comments
storage/maria/ma_bitmap.c:
More ASSERTS to detect overwrite of bitmap pages
bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
Ensure we reserve place for directory entry when calculating place for head and tail pages
bitmap->size[0] should not include space for directory entry
Simplify code to abort when we found optimal bit pattern
Skip also full head page bit patterns when searching for tail (should speed up some common cases)
Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes
Fixed wrong assert in flush_log_for_bitmap
Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
storage/maria/ma_blockrec.c:
Ensure my_errno is set on return
Fixed not optimal setting of row->min_length if we don't have variable length fields
Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
Added DBUG_ASSERT() if we read or write wrong VARCHAR data
Added DBUG_ASSERT() to find out if row sizes are calculated wrong
Fixed bug in logging of rows with more than one big blob
storage/maria/ma_check.c:
Disable logging while normal repair is done to avoid logging of index changes
Fixed bug that caused CHECKSUM part of key page to be used
Fixed that deletion of wrong records also works for BLOCK_RECORD
Clear unallocated pages:
- BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not
Better error reporting
More information about record page utilization
Change printing of file position to printing of pages to make output more readable
Always print warning if rows are deleted
storage/maria/ma_create.c:
Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future)
Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order
Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization)
Store other fields in length order (to get better utilization of head block)
storage/maria/ma_delete.c:
Ensure my_errno is set on return
storage/maria/ma_dynrec.c:
Indentation fix
storage/maria/ma_locking.c:
Set changed if open_count is counted down. (To avoid getting error "client is using or hasn't closed the table properly" with transactional tables)
storage/maria/ma_loghandler.c:
Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja)
Added more DBUG
Indentation fixes
storage/maria/ma_open.c:
Removed wrong casts
storage/maria/ma_page.c:
Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new()
Mark new pages as changed (Required to get repair() to work)
storage/maria/ma_pagecache.c:
Fixed test of error condition for flush
Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock()
Added call to my_debug_put_break_here() in case of errors (for debugging)
storage/maria/ma_pagecrc.c:
Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32
storage/maria/ma_recovery.c:
Call my_printv_error() from eprint() to get critical errors to mysqld log
Removed \n from error strings to eprint() to get nicer output in mysqld
Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed
storage/maria/ma_update.c:
Ensure my_errno is set on return
storage/maria/ma_write.c:
Ensure my_errno is set on return
storage/maria/maria_chk.c:
Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries)
Added option --skip-safemalloc
Don't write exponents for rec/key
storage/maria/maria_def.h:
Increase default repair buffer to 128M for maria_chk and maria_read_log
Increase default sort buffer for maria_chk to 64M
storage/maria/unittest/Makefile.am:
Don't update files automatically from bitkeeper
storage/maria/unittest/ma_pagecache_consist.c:
Remove testfile at end
storage/maria/unittest/ma_pagecache_single.c:
Remove testfile at end
storage/maria/unittest/ma_test_all-t:
More tests
Safer checking if test caused error
2008-01-07 17:54:41 +01:00
        eprint(tracef, "Failed to read record");
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and so is unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and so is unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
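The checksum-delta scheme described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the Maria code: `sketch_checksum_t`, `sketch_lsn_t`, `struct sketch_state`, `sketch_checksum_delta()` and `sketch_redo_apply_delta()` are hypothetical names, the checksum is assumed to be an unsigned 32-bit value that wraps modulo 2^32, and the "state is older than the record" test is assumed to be a strict comparison of `is_of_lsn` against the record's LSN.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the real types; assumptions for this sketch. */
typedef uint32_t sketch_checksum_t;  /* assumed 32-bit, wraps modulo 2^32 */
typedef uint64_t sketch_lsn_t;

struct sketch_state {
  sketch_checksum_t checksum;   /* table's live checksum */
  sketch_lsn_t      is_of_lsn;  /* state reflects the log up to this LSN */
};

/* Delta stored in the UNDO record when the operation is logged. */
static sketch_checksum_t sketch_checksum_delta(sketch_checksum_t before,
                                               sketch_checksum_t after)
{
  return after - before;        /* unsigned arithmetic, may wrap */
}

/*
  REDO phase: apply the stored delta only if the state is older than the
  record. Returns 1 if the checksum was updated, 0 if the record was skipped.
  The caller is assumed to advance is_of_lsn elsewhere (e.g. at state flush).
*/
static int sketch_redo_apply_delta(struct sketch_state *state,
                                   sketch_lsn_t record_lsn,
                                   sketch_checksum_t delta)
{
  if (state->is_of_lsn >= record_lsn)
    return 0;                   /* state already includes this record */
  state->checksum+= delta;
  return 1;
}
```

With the numbers from the commit message, a change from 123 to 456 yields a stored delta of 333; applying it to a state whose checksum is 123 gives back 456, and a state that is already newer than the record is left untouched.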
2007-10-02 18:02:09 +02:00
return 1;
}
share->state.state.checksum+= ha_checksum_korr(buff);
}
Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type Added --zerofill (-z) option to maria_chk (Mostly code from Jani) Added new table states ZEROFILLED and MOVABLE Set state.changed with new states when things change include/maria.h: Added maria_zerofill include/myisamchk.h: Added option for zerofill Extend testflag to be 64 to allow for more flags mysql-test/r/create.result: Updated results after merge mysql-test/r/maria.result: Updated results after merge mysys/my_getopt.c: Removed not used variable sql/sql_show.cc: Fixed wrong page type storage/maria/ma_blockrec.c: Renamed compact_page() to ma_compact_block_page() and made it global Always zerofill half filled blob pages Set share.state.changed on REDO storage/maria/ma_blockrec.h: Added _ma_compact_block_page() storage/maria/ma_check.c: Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type This gives the following benefits: - Table is smaller if compressed - All LSN are removed for transactional tables and this makes them movable between systems Don't set table states if we are using --quick Changed log entry for repair to use 8 bytes for flag storage/maria/ma_delete.c: Simplify code Update state.changed storage/maria/ma_key_recover.c: Update state.changed storage/maria/ma_locking.c: Set uuid for file on first change if it's not set (table was cleared with zerofill) storage/maria/ma_loghandler.c: Updated log entry for REDO_REPAIR_TABLE storage/maria/ma_recovery.c: Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes) Set new bits in state.changed storage/maria/ma_test_all.sh: Nicer output storage/maria/ma_test_recovery.expected: Updated results (now states flags are visible) storage/maria/ma_update.c: Update state.changed storage/maria/ma_write.c: Simplify code Set state.changed 
storage/maria/maria_chk.c: Added option --zerofill Added printing of states for MOVABLE and ZEROFILLED MYD -> MAD MYI -> MAI storage/maria/maria_def.h: Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED Added prototype for new functions storage/maria/unittest/ma_test_all-t: More tests, including tests for zerofill Removed some not needed 'print' statements storage/maria/unittest/ma_test_loghandler_multithread-t.c: Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
info->s->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
STATE_NOT_ZEROFILLED | STATE_NOT_MOVABLE);
2007-07-26 11:56:21 +02:00
}
tprint(tracef, " rows' count %lu\n", (ulong)info->s->state.state.records);
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitely flushes log (WAL)) and flush log, both causes rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
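The "keep pages pinned, stamp them at the end of the group" fix described above can be illustrated with a small model. This is a sketch under stated assumptions, not the page cache code: `struct sketch_page`, `sketch_apply_redo()` and `sketch_unpin_all()` are hypothetical, and the skip test is assumed to compare the page's LSN against the LSN that ends the group (the UNDO's LSN), which is what would make an early stamp hide a second REDO for the same page.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_lsn_t;       /* hypothetical LSN type */

struct sketch_page {
  sketch_lsn_t lsn;      /* LSN stamped on the page */
  int          pinned;   /* 1 while the current group still owns the page */
  int          applied;  /* number of REDOs applied, for illustration */
};

/*
  Apply one REDO of the current group. The page is skipped only if it is
  already as recent as the end of the group. The page LSN is deliberately
  NOT changed here, so a later REDO of the same group is not skipped.
*/
static int sketch_apply_redo(struct sketch_page *pg,
                             sketch_lsn_t group_end_lsn)
{
  if (pg->lsn >= group_end_lsn)
    return 0;                        /* page already contains the group */
  pg->applied++;
  pg->pinned= 1;
  return 1;
}

/* End of group: stamp every pinned page with the UNDO's LSN, then unpin. */
static void sketch_unpin_all(struct sketch_page *pages, size_t n,
                             sketch_lsn_t group_end_lsn)
{
  size_t i;
  for (i= 0; i < n; i++)
  {
    if (!pages[i].pinned)
      continue;
    pages[i].lsn= group_end_lsn;
    pages[i].pinned= 0;
  }
}
```

In this model two REDOs for the same page within one group are both applied, and only after the group ends does a replay of the same group see the page as up to date and skip it.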
2007-11-13 17:12:29 +01:00
/* Unpin all pages, stamp them with UNDO's LSN */
_ma_unpin_all_pages(info, rec->lsn);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
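Two points of the commit message above lend themselves to a short sketch: why 0 can no longer mean "not UNDO work" once the previous UNDO's LSN is passed down, and how the record type stored in CLR_END could drive the update of state.records. Everything below is hypothetical: `SKETCH_LSN_ERROR` is an arbitrary sentinel standing in for LSN_ERROR (its real value is not given in the text), the enum and function names are invented, and the +1 for an undone DELETE is an inference from the -1 that the message describes for an undone INSERT.

```c
#include <stdint.h>

typedef uint64_t sketch_lsn_t;

/* Hypothetical sentinel: an LSN value assumed never to occur in the log. */
#define SKETCH_LSN_ERROR ((sketch_lsn_t) UINT64_MAX)

/* Type of the record that a CLR_END says was undone (hypothetical enum). */
enum sketch_undone_type
{
  SKETCH_UNDONE_INSERT,
  SKETCH_UNDONE_UPDATE,
  SKETCH_UNDONE_DELETE
};

/*
  0 is a legitimate "previous UNDO" value for a transaction's first UNDO,
  so only the sentinel means "this call is not UNDO work".
*/
static int sketch_is_undo_work(sketch_lsn_t undo_lsn)
{
  return undo_lsn != SKETCH_LSN_ERROR;
}

/* Row count after a CLR_END for the given undone record type. */
static uint64_t sketch_records_after_clr_end(uint64_t records,
                                             enum sketch_undone_type type)
{
  switch (type)
  {
  case SKETCH_UNDONE_INSERT:
    return records - 1;   /* the inserted row was removed again */
  case SKETCH_UNDONE_DELETE:
    return records + 1;   /* assumed: the deleted row was re-inserted */
  default:
    return records;       /* an undone UPDATE leaves the count unchanged */
  }
}
```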
2007-09-06 16:04:36 +02:00
return 0;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually roll back). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
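The three-way classification described for ma_loghandler.h (a record may not end a group, may end a group, or may be a group in itself) can be illustrated with a small sketch. The enum, struct and function names below are hypothetical and only mirror the wording of the commit message; they are not taken from the Maria sources. Counting a group that is still open at the end of the log as aborted is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; they only mirror the three cases in the message. */
enum rec_group_kind
{
  REC_NOT_LAST_IN_GROUP,  /* e.g. a REDO: more records of the group follow */
  REC_LAST_IN_GROUP,      /* e.g. an UNDO: closes the group of REDOs */
  REC_IS_GROUP_ITSELF     /* e.g. a COMMIT: stands alone */
};

struct group_scan_result
{
  size_t executed_records;  /* records belonging to complete groups */
  size_t aborted_records;   /* records of groups that were never closed */
};

/*
  Walk a sequence of record kinds and decide which records Recovery
  would execute. A record which is a group in itself aborts any
  unfinished group that precedes it.
*/
struct group_scan_result scan_groups(const enum rec_group_kind *recs,
                                     size_t n)
{
  struct group_scan_result res= {0, 0};
  size_t pending= 0;  /* records of the currently open group */
  size_t i;
  assert(recs != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    switch (recs[i])
    {
    case REC_NOT_LAST_IN_GROUP:
      pending++;
      break;
    case REC_LAST_IN_GROUP:
      res.executed_records+= pending + 1;
      pending= 0;
      break;
    case REC_IS_GROUP_ITSELF:
      res.aborted_records+= pending;  /* unfinished group is dropped */
      pending= 0;
      res.executed_records++;
      break;
    }
  }
  /* Assumption: a group still open at the end of the log is not run. */
  res.aborted_records+= pending;
  return res;
}
```

With the sequence REDO, REDO, COMMIT, REDO, UNDO, the first two REDOs are dropped because the COMMIT arrives before any UNDO closes their group, which is the write-failure scenario the message describes.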
}
prototype_redo_exec_hook(UNDO_ROW_DELETE)
{
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
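The hash key for all_dirty_pages is described as being built from index_or_data, the table's short id and the page number. One way such a key could be packed into a single 64-bit integer is sketched here. The bit layout, the 40-bit page-number width, and all function and macro names are assumptions for illustration; the commit message only states which three components make up the key and that the checkpoint record stores 2+1 bytes of table identification per dirty page.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed width of the page number inside the key (illustrative only). */
#define SKETCH_PAGENO_BITS 40

/*
  Pack (index-or-data flag, 2-byte table short id, page number) into
  one integer usable as a hash key. is_index is 1 for an index file
  page and 0 for a data file page.
*/
uint64_t dirty_page_key(int is_index, uint16_t table_short_id,
                        uint64_t pageno)
{
  assert(is_index == 0 || is_index == 1);
  assert(pageno < ((uint64_t) 1 << SKETCH_PAGENO_BITS));
  return ((((uint64_t) is_index << 16) | table_short_id)
          << SKETCH_PAGENO_BITS) | pageno;
}

/* Accessors that reverse the packing above. */
int key_is_index(uint64_t key)
{
  return (int) ((key >> (SKETCH_PAGENO_BITS + 16)) & 1);
}

uint16_t key_table_short_id(uint64_t key)
{
  return (uint16_t) (key >> SKETCH_PAGENO_BITS);
}

uint64_t key_pageno(uint64_t key)
{
  return key & (((uint64_t) 1 << SKETCH_PAGENO_BITS) - 1);
}
```

Because the key no longer depends on a file descriptor, the same page of the same table maps to the same key regardless of which descriptors the recovering process happens to get.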
set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
if (info == NULL)
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug; see the comment for ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
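The undo_lsn handling described in this commit message can be illustrated with a minimal sketch. Every name below (toy_lsn, toy_undo, toy_trn, TOY_LSN_NOT_UNDO) is hypothetical and the sentinel value is an assumption; this is not the Maria implementation. It only shows why 0 can no longer mean "this is not UNDO work" once the previous UNDO's LSN, which may itself be 0, is what gets passed down and stored in the CLR_END.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real Maria types. */
typedef uint64_t toy_lsn;

/* 0 means "no previous UNDO": the record was the transaction's first UNDO. */
#define TOY_LSN_NONE      ((toy_lsn) 0)
/* Assumed "impossible" LSN used as the "not UNDO work" marker. */
#define TOY_LSN_NOT_UNDO  ((toy_lsn) 1)

struct toy_undo
{
  toy_lsn lsn;            /* LSN of this UNDO record */
  toy_lsn prev_undo_lsn;  /* LSN of the transaction's previous UNDO, or 0 */
};

struct toy_trn
{
  toy_lsn undo_lsn;       /* next UNDO to execute when rolling back */
};

/*
  Execute one UNDO: the value returned is what the CLR_END would store
  (the _previous_ UNDO's LSN), and trn->undo_lsn is moved to that same
  value only after the record has been executed.
*/
static toy_lsn toy_execute_undo(struct toy_trn *trn,
                                const struct toy_undo *undo)
{
  assert(trn->undo_lsn == undo->lsn);
  /* ... the row change would be rolled back here ... */
  trn->undo_lsn= undo->prev_undo_lsn;
  return undo->prev_undo_lsn;
}

/* A previous-UNDO LSN of 0 is still UNDO work; only the sentinel is not. */
static int toy_is_undo_work(toy_lsn undo_lsn)
{
  return undo_lsn != TOY_LSN_NOT_UNDO;
}
```

Walking a chain of three UNDOs with toy_execute_undo() leaves trn->undo_lsn at 0 after the last one, and toy_is_undo_work(0) still reports UNDO work, which is the case the old "undo_lsn==0" test could not represent.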
2007-09-06 16:04:36 +02:00
return 0;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always
allocate info.pinned_pages (we now need it also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address of used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key Use MARIA_RECORD_POS for record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open() Zerofill new
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
share= info->s;
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expects the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
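The checksum-delta scheme in this commit message (a 4-byte variation stored in the UNDO record, applied in the REDO phase only when the table state is older than the record) can be sketched as follows. All names here (toy_state, toy_delta_store, toy_redo_apply_delta and so on) are hypothetical, and the low-byte-first layout is an assumption for illustration; the real code stores and reads the delta through its own macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real Maria structures. */
typedef uint64_t toy_lsn;

struct toy_state
{
  uint32_t checksum;   /* the table's live checksum */
  toy_lsn  is_of_lsn;  /* the state reflects the log up to this LSN */
};

/* Delta caused by one operation, modulo 2^32 (e.g. 456 - 123 = 333). */
static uint32_t toy_delta(uint32_t old_sum, uint32_t new_sum)
{
  return (uint32_t) (new_sum - old_sum);
}

/* Store the delta in 4 bytes, low byte first (assumed layout). */
static void toy_delta_store(unsigned char *to, uint32_t delta)
{
  assert(to != 0);
  to[0]= (unsigned char) (delta & 0xFF);
  to[1]= (unsigned char) ((delta >> 8) & 0xFF);
  to[2]= (unsigned char) ((delta >> 16) & 0xFF);
  to[3]= (unsigned char) ((delta >> 24) & 0xFF);
}

/* Read back a delta written by toy_delta_store(). */
static uint32_t toy_delta_read(const unsigned char *from)
{
  return (uint32_t) from[0] |
         ((uint32_t) from[1] << 8) |
         ((uint32_t) from[2] << 16) |
         ((uint32_t) from[3] << 24);
}

/*
  REDO phase: apply the stored delta only if the state is older than the
  record (is_of_lsn lower than the record's LSN). Returns 1 if applied.
*/
static int toy_redo_apply_delta(struct toy_state *state, toy_lsn rec_lsn,
                                const unsigned char *stored)
{
  if (state->is_of_lsn >= rec_lsn)
    return 0;                     /* state already includes this record */
  state->checksum+= toy_delta_read(stored);
  return 1;
}
```

With a state whose checksum is 123 and a stored delta of 333, applying a record newer than the state brings the checksum to 456; a record at or below is_of_lsn leaves it unchanged. Unsigned 32-bit arithmetic keeps the delta well defined even when the new checksum is numerically smaller than the old one.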
2007-10-02 18:02:09 +02:00
if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
2007-07-26 11:56:21 +02:00
{
tprint(tracef, " state older than record\n");
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is therefore unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is therefore unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
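The checksum-delta scheme described in the commit message above can be sketched as follows. This is only an illustration, not the code from the patch: the type names, struct table_state, and the helpers checksum_delta(), delta_store(), delta_korr() and redo_apply_delta() are all hypothetical, the 4-byte little-endian layout is an assumption, and the "state is older than the record" test is reduced to a plain LSN comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real types live in the Maria headers. */
typedef uint32_t checksum_t;
typedef uint64_t lsn_t;

#define CHECKSUM_STORE_SIZE 4

struct table_state
{
  checksum_t checksum;  /* live checksum of the table */
  lsn_t is_of_lsn;      /* LSN up to which this state is current */
};

/* Delta logged with the UNDO: new checksum minus old, modulo 2^32. */
static checksum_t checksum_delta(checksum_t before, checksum_t after)
{
  return after - before;
}

/* Store the delta in 4 bytes (little-endian is an assumption here). */
static void delta_store(unsigned char *buf, checksum_t delta)
{
  buf[0]= (unsigned char) (delta & 0xff);
  buf[1]= (unsigned char) ((delta >> 8) & 0xff);
  buf[2]= (unsigned char) ((delta >> 16) & 0xff);
  buf[3]= (unsigned char) ((delta >> 24) & 0xff);
}

/* Read the 4-byte delta back. */
static checksum_t delta_korr(const unsigned char *buf)
{
  return (checksum_t) buf[0] | ((checksum_t) buf[1] << 8) |
         ((checksum_t) buf[2] << 16) | ((checksum_t) buf[3] << 24);
}

/*
  REDO phase: apply the delta only if the table's state is older than
  the log record. Returns 1 if the checksum was updated, 0 if skipped.
*/
static int redo_apply_delta(struct table_state *state, lsn_t rec_lsn,
                            const unsigned char *buf)
{
  if (state->is_of_lsn >= rec_lsn)
    return 0;                       /* state already includes this record */
  state->checksum+= delta_korr(buf);
  return 1;
}

/* Usage: the 123 -> 456 example from the commit message. */
checksum_t checksum_delta_demo(void)
{
  unsigned char buf[CHECKSUM_STORE_SIZE];
  struct table_state state= { 123, 10 };
  delta_store(buf, checksum_delta(123, 456));  /* stores 333 */
  assert(redo_apply_delta(&state, 20, buf) == 1);
  return state.checksum;
}
```

Because the arithmetic is unsigned, a decrease in the checksum wraps around and still round-trips, so the same 4-byte field works for every operation type.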
  share->state.state.records--;
  if (share->calc_checksum)
  {
    uchar buff[HA_CHECKSUM_STORE_SIZE];
    if (translog_read_record(rec->lsn, LSN_STORE_SIZE + FILEID_STORE_SIZE +
UNDO of rows now puts back all part of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn of DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for not transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed of REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that was defined as BOOLEAN in dbug.c code Added DBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn of DBUG with --debug-on=0) Indentation fixes 
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for not transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all part of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directroy entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minimum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position Ensure allocations of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed som bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there is no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused paged cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easly mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings. 
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we were in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always include row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling UNDOs (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before blobs were always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that test recovery of update with blobs Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_maria_row->min_length' for storing min length needed on head page Added 'st_maria_handler->blob_buff & blob_buff_size' for 
storing blobs Moved all tuning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before blobs were always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
                             PAGE_STORE_SIZE + DIRPOS_STORE_SIZE + 2 +
                             PAGERANGE_STORE_SIZE,
                             HA_CHECKSUM_STORE_SIZE, buff, NULL) !=
        HA_CHECKSUM_STORE_SIZE)
    {
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
      eprint(tracef, "Failed to read record");
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expects the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
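The checksum delta described in the commit message above (456-123 = 333, stored in 4 bytes) can be illustrated with a minimal, self-contained sketch. All names below (`table_state_sketch`, `store_checksum_delta`, `read_checksum_delta`, `redo_apply_checksum_delta`) are hypothetical and are not the functions or macros used in Maria; the little-endian byte layout is an assumption as well. The sketch relies on unsigned 32-bit wraparound, so a delta also works when the new checksum is smaller than the old one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of the table state discussed above. */
struct table_state_sketch
{
  uint32_t checksum;   /* live checksum of the table */
  uint64_t is_of_lsn;  /* LSN up to which the state is known to be current */
};

/* Store (new_sum - old_sum) in 4 bytes; little-endian is an assumption. */
static void store_checksum_delta(unsigned char *buf,
                                 uint32_t old_sum, uint32_t new_sum)
{
  uint32_t delta= (uint32_t) (new_sum - old_sum);  /* wraps modulo 2^32 */
  buf[0]= (unsigned char) (delta & 0xFF);
  buf[1]= (unsigned char) ((delta >> 8) & 0xFF);
  buf[2]= (unsigned char) ((delta >> 16) & 0xFF);
  buf[3]= (unsigned char) ((delta >> 24) & 0xFF);
}

/* Read the 4-byte delta back. */
static uint32_t read_checksum_delta(const unsigned char *buf)
{
  return (uint32_t) buf[0] | ((uint32_t) buf[1] << 8) |
         ((uint32_t) buf[2] << 16) | ((uint32_t) buf[3] << 24);
}

/*
  REDO-phase step: apply the delta only if the state is older than the
  record. Returns 1 if the delta was applied, 0 if it was skipped.
  is_of_lsn is deliberately left untouched in this sketch.
*/
static int redo_apply_checksum_delta(struct table_state_sketch *state,
                                     uint64_t record_lsn,
                                     const unsigned char *buf)
{
  if (state->is_of_lsn >= record_lsn)
    return 0;                       /* state already includes this record */
  state->checksum+= read_checksum_delta(buf);
  return 1;
}
```

With a state whose checksum is 123 and whose `is_of_lsn` is lower than the record's LSN, applying the stored delta brings the checksum to 456; a state that is already newer than the record is left unchanged, which is what makes replaying the log repeatable.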
2007-10-02 18:02:09 +02:00
return 1;
}
share->state.state.checksum+= ha_checksum_korr(buff);
}
Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type Added --zerofill (-z) option to maria_chk (Mostly code from Jani) Added new table states ZEROFILLED and MOVABLE Set state.changed with new states when things change include/maria.h: Added maria_zerofill include/myisamchk.h: Added option for zerofill Extend testflag to be 64 to allow for more flags mysql-test/r/create.result: Updated results after merge mysql-test/r/maria.result: Updated results after merge mysys/my_getopt.c: Removed not used variable sql/sql_show.cc: Fixed wrong page type storage/maria/ma_blockrec.c: Renamed compact_page() to ma_compact_block_page() and made it global Always zerofill half filled blob pages Set share.state.changed on REDO storage/maria/ma_blockrec.h: Added _ma_compact_block_page() storage/maria/ma_check.c: Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type This gives the following benefits: - Table is smaller if compressed - All LSN are removed for transactional tables and this makes them movable between systems Don't set table states if we are using --quick Changed log entry for repair to use 8 bytes for flag storage/maria/ma_delete.c: Simplify code Update state.changed storage/maria/ma_key_recover.c: Update state.changed storage/maria/ma_locking.c: Set uuid for file on first change if it's not set (table was cleared with zerofill) storage/maria/ma_loghandler.c: Updated log entry for REDO_REPAIR_TABLE storage/maria/ma_recovery.c: Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes) Set new bits in state.changed storage/maria/ma_test_all.sh: Nicer output storage/maria/ma_test_recovery.expected: Updated results (now states flags are visible) storage/maria/ma_update.c: Update state.changed storage/maria/ma_write.c: Simplify code Set state.changed 
storage/maria/maria_chk.c: Added option --zerofill Added printing of states for MOVABLE and ZEROFILLED MYD -> MAD MYI -> MAI storage/maria/maria_def.h: Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED Added prototype for new functions storage/maria/unittest/ma_test_all-t: More tests, including tests for zerofill Removed some not needed 'print' statements storage/maria/unittest/ma_test_loghandler_multithread-t.c: Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
STATE_NOT_OPTIMIZED_ROWS | STATE_NOT_ZEROFILLED |
STATE_NOT_MOVABLE);
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few 100 rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
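The three-way classification described in the commit message above (a record may not end a group, may end a group, or may be a group in itself and abort any unfinished group) can be sketched as a small state machine. This is only an illustration: the enum values, the struct and `see_record_sketch()` are hypothetical names, and the counters stand in for the real work of executing or discarding REDOs.

```c
#include <assert.h>

/* Hypothetical names; the real LOG_DESC member is record_ends_group. */
enum ends_group_sketch
{
  SKETCH_NOT_LAST_IN_GROUP,  /* e.g. a REDO: waits for the group's end */
  SKETCH_LAST_IN_GROUP,      /* e.g. an UNDO: ends the group of REDOs */
  SKETCH_IS_GROUP_ITSELF     /* e.g. a COMMIT: aborts any unfinished group */
};

struct group_state_sketch
{
  int pending_redos;   /* REDOs seen since the last group boundary */
  int executed_redos;  /* REDOs of complete groups, which get executed */
  int aborted_groups;  /* unfinished groups that were discarded */
};

/* Feed one log record's classification to the group tracker. */
static void see_record_sketch(struct group_state_sketch *s,
                              enum ends_group_sketch kind)
{
  switch (kind)
  {
  case SKETCH_NOT_LAST_IN_GROUP:
    s->pending_redos++;                  /* remember it, do not execute yet */
    break;
  case SKETCH_LAST_IN_GROUP:
    s->executed_redos+= s->pending_redos; /* group is complete: execute it */
    s->pending_redos= 0;
    break;
  case SKETCH_IS_GROUP_ITSELF:
    if (s->pending_redos > 0)
      s->aborted_groups++;               /* REDOs without their UNDO: drop */
    s->pending_redos= 0;
    break;
  }
}
```

Feeding REDO, REDO, UNDO executes two REDOs; a following REDO and then COMMIT leaves the executed count at two and records one aborted group, matching the "physical write failure before the UNDO was logged" scenario in the message.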
2007-07-26 11:56:21 +02:00
}
2007-10-02 18:02:09 +02:00
tprint(tracef, " rows' count %lu\n", (ulong)share->state.state.records);
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: the test sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
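The "two REDOs about the same page in one group" fix described above can be illustrated with a minimal sketch: pages touched by a group stay pinned with their old LSN, and are stamped with the UNDO's LSN only when the group ends. Everything here is hypothetical (`page_sketch`, `redo_group_sketch`, `redo_apply_sketch`, `unpin_all_sketch`), and the skip rule "page LSN >= REDO's LSN means already applied" is an assumption made for the illustration, not a quote of the Maria code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_PINNED 8

struct page_sketch
{
  uint64_t lsn;     /* LSN stamped on the page */
  int applied;      /* number of REDOs applied to the page */
  int pinned;       /* 1 while the page belongs to the current group */
};

struct redo_group_sketch
{
  struct page_sketch *pinned[SKETCH_MAX_PINNED];
  int count;
};

/*
  Apply one REDO. Returns 1 if applied, 0 if skipped because the page
  already contains the change, -1 if the pin list is full.
  The page's LSN is NOT changed here; that is the point of the fix.
*/
static int redo_apply_sketch(struct redo_group_sketch *g,
                             struct page_sketch *page, uint64_t redo_lsn)
{
  if (page->lsn >= redo_lsn)
    return 0;                            /* assumed skip rule */
  if (!page->pinned)
  {
    if (g->count == SKETCH_MAX_PINNED)
      return -1;
    page->pinned= 1;
    g->pinned[g->count++]= page;         /* keep pinned until group end */
  }
  page->applied++;
  return 1;
}

/* End of group: stamp every pinned page with the UNDO's LSN and unpin. */
static void unpin_all_sketch(struct redo_group_sketch *g, uint64_t undo_lsn)
{
  int i;
  for (i= 0; i < g->count; i++)
  {
    g->pinned[i]->lsn= undo_lsn;
    g->pinned[i]->pinned= 0;
  }
  g->count= 0;
}
```

With a page at LSN 5, REDOs at LSN 10 and 11 and an UNDO at LSN 12, both REDOs are applied before the page is stamped with 12. Had the page been stamped with 12 right after the first REDO, the second one (LSN 11) would have been skipped, which is the bug the commit describes. Replaying the same REDO after the stamp is skipped, so the replay stays idempotent.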
2007-11-13 17:12:29 +01:00
_ma_unpin_all_pages(info, rec->lsn);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
return 0;
}
prototype_redo_exec_hook(UNDO_ROW_UPDATE)
{
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
if (info == NULL)
return 0;
First part of redo/undo for key pages

Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up by 1 bit to make room to indicate if a transid follows
Checksum for MyISAM now ignores NULL and unused parts of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable

include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
  Added new error message to be used when handler initialization failed
include/my_global.h:
  Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
  Added const to some parameters
mysys/array.c:
  More DBUG
mysys/my_error.c:
  Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h:
  Renamed variables to avoid variable shadowing
sql/handler.h:
  Renamed parameter to avoid variable name conflict
sql/item.h:
  Renamed variables to avoid variable shadowing
sql/log_event_old.h:
  Renamed variables to avoid variable shadowing
sql/set_var.h:
  Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc:
  Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am:
  Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
  Removed extra empty line
storage/maria/ma_checkpoint.c:
  Removed unneeded trnman.h
storage/maria/ma_close.c:
  Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
  Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Removed one unneeded _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c:
  New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (now also needed for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some unneeded arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some unneeded initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
  Removed end space
storage/maria/ma_check.c:
  Removed end space
storage/maria/ma_create.c:
  Removed end space
storage/maria/ma_locking.c:
  Removed end space
storage/maria/ma_packrec.c:
  Removed end space
storage/maria/ma_pagecache.c:
  Removed end space
storage/maria/ma_panic.c:
  Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  Use MARIA_RECORD_POS for record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c:
  Removed end space
storage/maria/ma_statrec.c:
  Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Remove unused code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
  Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open()
  Zerofill newly used pages:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c:
  Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
  Updated arguments for changed function calls
storage/myisam/mi_check.c:
  Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
  Update checksums to ignore null columns
storage/myisam/mi_create.c:
  Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
  Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
  Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
  New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
  New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
  New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
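The one-bit shift of the record number mentioned in this changeset can be illustrated with a minimal sketch. The helper names `pack_recpos`, `unpack_recpos` and `recpos_has_transid` are hypothetical and are not taken from the Maria sources. The sketch also assumes that the freed low bit is the one used as the "transid follows" flag, which the commit message does not state.

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical helpers, not Maria's real API.
  Assumption: the record number is shifted left by one bit and the
  low bit tells whether a transaction id follows the key.
*/
static uint64_t pack_recpos(uint64_t recpos, int transid_follows)
{
  /* The top bit is lost by the shift, so it must not be in use */
  assert(recpos <= (UINT64_MAX >> 1));
  return (recpos << 1) | (transid_follows ? 1u : 0u);
}

/* Recover the original record number by dropping the flag bit */
static uint64_t unpack_recpos(uint64_t stored)
{
  return stored >> 1;
}

/* Return 1 if the stored value says that a transid follows, else 0 */
static int recpos_has_transid(uint64_t stored)
{
  return (int) (stored & 1u);
}
```

With these helpers, record number 5 with the flag set is stored as 11, and unpacking 11 gives back 5.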
share= info->s;
WL#3072 - Maria recovery.

* Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record).
* If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support.

storage/maria/ha_maria.cc:
  question for Monty
storage/maria/ma_blockrec.c:
  * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c).
  * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the record being undone, and also to pass down to all UNDOs the live checksum variation caused by the operation.
  * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123).
  * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr.
  * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END.
  * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed.
  * _ma_update_block_record2() now expects the new row's checksum in cur_row.checksum (was already true) and the old row's checksum in new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this.
  * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of the record's checksum.
storage/maria/ma_blockrec.h:
  in-write hooks move from ma_loghandler.c
storage/maria/ma_check.c:
  more straightforward size of buffer
storage/maria/ma_checkpoint.c:
  <= is enough
storage/maria/ma_commit.c:
  new prototype of translog_write_record()
storage/maria/ma_create.c:
  new prototype of translog_write_record()
storage/maria/ma_delete.c:
  The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE.
storage/maria/ma_delete_all.c:
  @todo is now done (in ma_loghandler.c)
storage/maria/ma_delete_table.c:
  new prototype of translog_write_record()
storage/maria/ma_loghandler.c:
  * in-write hooks move to ma_blockrec.c.
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable.
  * fix for compiler warning (unused buffer_start when compiling without debug support)
  * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise).
  * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END.
  * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter.
storage/maria/ma_loghandler.h:
  * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record
  * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable.
storage/maria/ma_open.c:
  fix for "empty body in if() statement" (when compiling without safemutex)
storage/maria/ma_pagecache.c:
  <= is enough
storage/maria/ma_recovery.c:
  * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds).
  * In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL.
  * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert().
  * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9)
  * removing ' in #error as it confuses gcc3
storage/maria/ma_rename.c:
  new prototype of translog_write_record()
storage/maria/ma_test_recovery.expected:
  Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum appears to work (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too.
storage/maria/ma_update.c:
  * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway.
  * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function.
storage/maria/ma_write.c:
  If inserting into a transactional table, the live checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if().
storage/maria/unittest/ma_test_loghandler-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
  new prototype of translog_write_record()
storage/myisam/sort.c:
  fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
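The checksum-delta scheme described in this changeset can be sketched in a few lines of standalone C. This is an illustration under stated assumptions, not the Maria implementation: `table_state_sketch`, `store_checksum_delta`, `read_checksum_delta` and `redo_apply_checksum_delta` are hypothetical names, the LSN is modelled as a plain 64-bit integer, and the little-endian byte order of the 4 stored bytes is assumed. The "apply only if the record is not older than the state" test mirrors the `cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0` check in the code lines annotated here.

```c
#include <assert.h>
#include <stdint.h>

#define CHECKSUM_STORE_SIZE_SKETCH 4   /* stands in for HA_CHECKSUM_STORE_SIZE */

/* Hypothetical, simplified table state: horizon of the state + live checksum */
struct table_state_sketch
{
  uint64_t is_of_horizon;
  uint32_t checksum;
};

/* Store (new - old) in 4 bytes; unsigned arithmetic wraps modulo 2^32 */
static void store_checksum_delta(unsigned char *buf,
                                 uint32_t old_sum, uint32_t new_sum)
{
  uint32_t delta= new_sum - old_sum;
  int i;
  assert(buf != 0);
  for (i= 0; i < CHECKSUM_STORE_SIZE_SKETCH; i++)
    buf[i]= (unsigned char) ((delta >> (8 * i)) & 0xFF);  /* assumed order */
}

/* Read the 4-byte delta back from the log record image */
static uint32_t read_checksum_delta(const unsigned char *buf)
{
  uint32_t delta= 0;
  int i;
  for (i= 0; i < CHECKSUM_STORE_SIZE_SKETCH; i++)
    delta|= ((uint32_t) buf[i]) << (8 * i);
  return delta;
}

/*
  REDO phase: apply the delta only if the record is at or after the
  horizon of the table's state. Returns 1 if applied, 0 if skipped.
*/
static int redo_apply_checksum_delta(struct table_state_sketch *state,
                                     uint64_t rec_lsn,
                                     const unsigned char *buf)
{
  if (rec_lsn < state->is_of_horizon)
    return 0;                       /* state already includes this change */
  state->checksum+= read_checksum_delta(buf);
  return 1;
}
```

Using the numbers from the commit message, a change from 123 to 456 stores the delta 333, and applying that delta to a state whose checksum is 123 yields 456. Because the arithmetic is unsigned, a decrease of the checksum is stored as a wrapped value and still adds back correctly.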
if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
2007-07-26 11:56:21 +02:00
{
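if (share->calc_checksum)
{
  /*
    The table has a live checksum: read the HA_CHECKSUM_STORE_SIZE bytes
    stored after the LSN, file id, page and dirpos fields of the record.
  */
  uchar buff[HA_CHECKSUM_STORE_SIZE];
  if (translog_read_record(rec->lsn, LSN_STORE_SIZE + FILEID_STORE_SIZE +
                           PAGE_STORE_SIZE + DIRPOS_STORE_SIZE,
                           HA_CHECKSUM_STORE_SIZE, buff, NULL) !=
      HA_CHECKSUM_STORE_SIZE)
  {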
Bugs fixed:
- If not in autocommit mode, delete rows one by one so that we can roll back if necessary
- bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
- Fixed bug in bitmap handling when allocating tail pages
- Ensure we reserve space for the directory entry when calculating space for head and tail pages
- Fixed wrong value in bitmap->size[0]
- Fixed wrong assert in flush_log_for_bitmap
- Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
- Mark new pages as changed (required to get repair() to work)
- Fixed problem with advancing log horizon pointer within one page bounds
- Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
- Fixed bug in logging of rows with more than one big blob
- Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls
- Flush pagecache when we change from logging to not logging (if not, pagecache code breaks)
- Ensure my_errno is set on return from write/delete/update
- Fixed bug when using FIELD_SKIP_PRESPACE

New features:
- mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution
- Fix that optimize works for Maria tables
- maria_check --zerofill now also clears freed blob pages
- maria_check -di now prints more information about record page utilization

Optimizations:
- Use pagecache_unlock_by_link() instead of pagecache_write() if possible (avoids a memory copy and a find_block)
- Simplify code to abort when we have found the optimal bit pattern
- Also skip full head page bit patterns when searching for tail
- Increase default repair buffer to 128M for maria_chk and maria_read_log
- Increase default sort buffer for maria_chk to 64M
- Increase size of sortbuffer and pagecache for mysqld to 64M
- VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables

Better reporting:
- Fixed test of error condition for flush (for better error code)
- More error messages to mysqld if Maria recovery fails
- Always print warning if rows are deleted in repair
- Added global function _db_force_flush() that is usable when debugging in gdb
- Added call to my_debug_put_break_here() in case of some errors (for debugging)
- Remove used testfiles in unittest, as these were written in different directories depending on where the test was started from

This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria

dbug/dbug.c:
  Added global function _db_force_flush() that is usable when debugging in gdb
extra/replace.c:
  Fixed memory leak
include/my_dbug.h:
  Prototype for _db_force_flush()
include/my_global.h:
  Added stdarg.h as my_sys.h now depends on it.
include/my_sys.h:
  Make my_dbug_put_break_here() a NOP if not DBUG build
  Added my_printv_error()
include/myisamchk.h:
  Added entry 'lost' to be able to count space that is lost forever
mysql-test/r/maria.result:
  Updated results
mysql-test/t/maria.test:
  Reset autocommit after test
  New test to check if delete_all_rows is used (verified with --debug)
mysys/my_error.c:
  Added my_printv_error()
scripts/mysql_fix_privilege_tables.sh:
  First use binaries and scripts from source distribution, then in installed distribution
  (This ensures that a development branch doesn't pick up wrong scripts)
sql/mysqld.cc:
  Fix that one can break maria recovery with ^C when debugging
sql/sql_class.cc:
  Removed #ifdef that has no effect (the preceding DBUG_ASSERT() ensures that the following code will not be executed)
storage/maria/ha_maria.cc:
  Increase size of sortbuffer and pagecache to 64M
  Fix that optimize works for Maria tables
  Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
  If not in autocommit mode, delete rows one by one so that we can roll back if necessary
  Fixed variable comments
storage/maria/ma_bitmap.c:
  More ASSERTS to detect overwrite of bitmap pages
  bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
  Ensure we reserve space for the directory entry when calculating space for head and tail pages
  bitmap->size[0] should not include space for directory entry
  Simplify code to abort when we have found the optimal bit pattern
  Also skip full head page bit patterns when searching for tail (should speed up some common cases)
  Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes
  Fixed wrong assert in flush_log_for_bitmap
  Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
storage/maria/ma_blockrec.c:
  Ensure my_errno is set on return
  Fixed non-optimal setting of row->min_length if we don't have variable length fields
  Use pagecache_unlock_by_link() instead of pagecache_write() if possible (avoids a memory copy and a find_block)
  Added DBUG_ASSERT() if we read or write wrong VARCHAR data
  Added DBUG_ASSERT() to find out if row sizes are calculated wrong
  Fixed bug in logging of rows with more than one big blob
storage/maria/ma_check.c:
  Disable logging while normal repair is done to avoid logging of index changes
  Fixed bug that caused CHECKSUM part of key page to be used
  Fixed that deletion of wrong records also works for BLOCK_RECORD
  Clear unallocated pages:
  - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not
  Better error reporting
  More information about record page utilization
  Change printing of file position to printing of pages to make output more readable
  Always print warning if rows are deleted
storage/maria/ma_create.c:
  Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future)
  Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL
  Fixed bug where fields could be used in wrong order
  Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization)
  Store other fields in length order (to get better utilization of head block)
storage/maria/ma_delete.c:
  Ensure my_errno is set on return
storage/maria/ma_dynrec.c:
  Indentation fix
storage/maria/ma_locking.c:
  Set changed if open_count is counted down.
  (To avoid getting error "client is using or hasn't closed the table properly" with transactional tables)
storage/maria/ma_loghandler.c:
  Fixed problem with advancing log horizon pointer within one page bounds (patch from Sanja)
  Added more DBUG
  Indentation fixes
storage/maria/ma_open.c:
  Removed wrong casts
storage/maria/ma_page.c:
  Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new()
  Mark new pages as changed (required to get repair() to work)
storage/maria/ma_pagecache.c:
  Fixed test of error condition for flush
  Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock()
  Added call to my_debug_put_break_here() in case of errors (for debugging)
storage/maria/ma_pagecrc.c:
  Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32
storage/maria/ma_recovery.c:
  Call my_printv_error() from eprint() to get critical errors to mysqld log
  Removed \n from error strings to eprint() to get nicer output in mysqld
  Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed
storage/maria/ma_update.c:
  Ensure my_errno is set on return
storage/maria/ma_write.c:
  Ensure my_errno is set on return
storage/maria/maria_chk.c:
  Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries)
  Added option --skip-safemalloc
  Don't write exponents for rec/key
storage/maria/maria_def.h:
  Increase default repair buffer to 128M for maria_chk and maria_read_log
  Increase default sort buffer for maria_chk to 64M
storage/maria/unittest/Makefile.am:
  Don't update files automatically from bitkeeper
storage/maria/unittest/ma_pagecache_consist.c:
  Remove testfile at end
storage/maria/unittest/ma_pagecache_single.c:
  Remove testfile at end
storage/maria/unittest/ma_test_all-t:
  More tests
  Safer checking if test caused error
2008-01-07 17:54:41 +01:00
        eprint(tracef, "Failed to read record");
        return 1;
      }
      share->state.state.checksum+= ha_checksum_korr(buff);
    }
Added maria_zerofill()
This is used to bzero all unused parts of the index pages and to compact and bzero the unused parts of the data pages of block-record type
Added --zerofill (-z) option to maria_chk (mostly code from Jani)
Added new table states ZEROFILLED and MOVABLE
Set state.changed with new states when things change

include/maria.h:
  Added maria_zerofill
include/myisamchk.h:
  Added option for zerofill
  Extend testflag to be 64 to allow for more flags
mysql-test/r/create.result:
  Updated results after merge
mysql-test/r/maria.result:
  Updated results after merge
mysys/my_getopt.c:
  Removed unused variable
sql/sql_show.cc:
  Fixed wrong page type
storage/maria/ma_blockrec.c:
  Renamed compact_page() to ma_compact_block_page() and made it global
  Always zerofill half filled blob pages
  Set share.state.changed on REDO
storage/maria/ma_blockrec.h:
  Added _ma_compact_block_page()
storage/maria/ma_check.c:
  Added maria_zerofill()
  This is used to bzero all unused parts of the index pages and to compact and bzero the unused parts of the data pages of block-record type
  This gives the following benefits:
  - Table is smaller if compressed
  - All LSNs are removed for transactional tables, and this makes them movable between systems
  Don't set table states if we are using --quick
  Changed log entry for repair to use 8 bytes for flag
storage/maria/ma_delete.c:
  Simplify code
  Update state.changed
storage/maria/ma_key_recover.c:
  Update state.changed
storage/maria/ma_locking.c:
  Set uuid for file on first change if it's not set (table was cleared with zerofill)
storage/maria/ma_loghandler.c:
  Updated log entry for REDO_REPAIR_TABLE
storage/maria/ma_recovery.c:
  Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes)
  Set new bits in state.changed
storage/maria/ma_test_all.sh:
  Nicer output
storage/maria/ma_test_recovery.expected:
  Updated results (now state flags are visible)
storage/maria/ma_update.c:
  Update state.changed
storage/maria/ma_write.c:
  Simplify code
  Set state.changed
storage/maria/maria_chk.c:
  Added option --zerofill
  Added printing of states for MOVABLE and ZEROFILLED
  MYD -> MAD
  MYI -> MAI
storage/maria/maria_def.h:
  Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED
  Added prototype for new functions
storage/maria/unittest/ma_test_all-t:
  More tests, including tests for zerofill
  Removed some unneeded 'print' statements
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
    share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
                            STATE_NOT_ZEROFILLED | STATE_NOT_MOVABLE);
First part of redo/undo for key pages
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up by 1 bit to make room to indicate if transid follows
Checksum for MyISAM now ignores NULL and unused part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to work around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable

include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
  Added new error message to be used when handler initialization failed
include/my_global.h:
  Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
  Added const to some parameters
mysys/array.c:
  More DBUG
mysys/my_error.c:
  Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h:
  Renamed variables to avoid variable shadowing
sql/handler.h:
  Renamed parameter to avoid variable name conflict
sql/item.h:
  Renamed variables to avoid variable shadowing
sql/log_event_old.h:
  Renamed variables to avoid variable shadowing
sql/set_var.h:
  Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc:
  Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
  This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am:
  Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
  Removed extra empty line
storage/maria/ma_checkpoint.c:
  Removed unneeded trnman.h
storage/maria/ma_close.c:
  Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
  Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Removed one unneeded _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c:
  New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (we now need it also for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address of used MARIA_PINNED_PAGE
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some unneeded arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some unneeded initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
  Removed end space
storage/maria/ma_check.c:
  Removed end space
storage/maria/ma_create.c:
  Removed end space
storage/maria/ma_locking.c:
  Removed end space
storage/maria/ma_packrec.c:
  Removed end space
storage/maria/ma_pagecache.c:
  Removed end space
storage/maria/ma_panic.c:
  Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  Use MARIA_RECORD_POS for record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c:
  Removed end space
storage/maria/ma_statrec.c:
  Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Removed unused code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
  Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open()
  Zerofill new used pages for:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c:
  Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
  Updated arguments for changed function calls
storage/myisam/mi_check.c:
  Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
  Update checksums to ignore null columns
storage/myisam/mi_create.c:
  Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
  Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
  Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
  New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
  New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
  New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  }
  _ma_unpin_all_pages(info, rec->lsn);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (now also needed for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
return 0;
}
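The UNDO_KEY_INSERT hook that follows recovers state.auto_increment during the REDO phase: the record is applied only if the table state is older than the record and the logged key is the auto-increment key. A minimal standalone sketch of that decision is given here. The types `table_state` and `undo_key_insert`, the function `replay_undo_key_insert()`, and the plain integer LSN are hypothetical stand-ins rather than the real Maria structures, and the "only ever move the value upward" rule is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a log address; real LSNs are compared
   with cmp_translog_addr(), here a plain integer comparison is used. */
typedef uint64_t lsn_t;

/* Hypothetical, reduced view of the table state. */
struct table_state {
  lsn_t    is_of_horizon;   /* state reflects the log up to this address */
  unsigned auto_key;        /* 1-based number of the auto-increment key, 0 if none */
  uint64_t auto_increment;  /* highest auto-increment value known so far */
};

/* Hypothetical, already-decoded UNDO_KEY_INSERT record. */
struct undo_key_insert {
  lsn_t    lsn;    /* address of the record in the log */
  unsigned keynr;  /* 0-based key number stored in the record */
  uint64_t value;  /* auto-increment value carried by the inserted key */
};

/* Returns 1 if the state was advanced, 0 if the record was skipped. */
static int replay_undo_key_insert(struct table_state *st,
                                  const struct undo_key_insert *rec)
{
  assert(st != 0 && rec != 0);
  if (rec->lsn < st->is_of_horizon)
    return 0;                       /* state is already newer than the record */
  if (st->auto_key != rec->keynr + 1)
    return 0;                       /* not the auto-increment key */
  if (rec->value <= st->auto_increment)
    return 0;                       /* assumption: never move the value down */
  st->auto_increment = rec->value;
  return 1;
}
```

Skipping records older than the state keeps replay idempotent in this sketch: running the same log twice leaves auto_increment unchanged the second time.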
prototype_redo_exec_hook(UNDO_KEY_INSERT)
{
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo of key pages works storage/maria/ma_test_recovery: Updated test after redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
MARIA_HA *info;
WL#3072 - Maria Recovery: recovery of state.auto_increment. When we log UNDO_KEY_INSERT for an auto_inc key, we update state.auto_increment (not anymore at the end of maria_write() except if this is a non-transactional table). When Recovery sees UNDO_KEY_INSERT in the REDO phase, it reads the auto_inc value from it and updates state.auto_increment. mysql-test/r/maria-recovery.result: Without the code fix, there would be in CHECK TABLE: "Auto-increment value: 0 is smaller than max used value: 3" and no AUTO_INCREMENT= clause in SHOW CREATE TABLE. mysql-test/t/maria-recovery.test: Test of recovery of state.auto_increment: from an old table, does the replaying of the log set state.auto_increment to what it should be. storage/maria/ma_check.c: new way of calling ma_retrieve_auto_increment(): pass key storage/maria/ma_key.c: ma_retrieve_auto_increment() now operates directly with a pointer to the key and not on the record. storage/maria/ma_key_recover.c: dedicated write_hook_for_undo_key_insert(): sets state.auto_increment under log's mutex. storage/maria/ma_key_recover.h: Dedicated hook for UNDO_KEY_INSERT, to set state.auto_increment. Such hook needs a new member st_msg_write_hook_for_undo_key::auto_increment, which contains the auto_increment value inserted. storage/maria/ma_loghandler.c: UNDO_KEY_INSERT gets a dedicated write_hook, to set auto_increment. storage/maria/ma_recovery.c: When in the REDO phase we see UNDO_KEY_INSERT: if the state is older than this record, and the key is the auto_increment one, read the key's value from the log record and update state.auto_increment. storage/maria/ma_test_all.sh: use $maria_path to be able to run from /dev/shm (faster) storage/maria/ma_update.c: bool is more of C++, using my_bool. If table is transactional, state.auto_increment is already updated in write_hook_for_undo_key_insert(). 
storage/maria/ma_write.c: If table is transactional, state.auto_increment is not updated at the end of maria_write() but rather in write_hook_for_undo_key_insert() (under log's mutex, so that a concurrent checkpoint does not read state.auto_increment while it is changing - corrupted). _ma_ck_write_btree_with_log() extracts the auto_increment value from the key, puts it into msg.auto_increment, and this is passed to write_hook_for_undo_key_insert(). storage/maria/maria_def.h: change of prototype of ma_retrieve_auto_increment() storage/maria/maria_read_log.c: use default log file size. Use separate page caches for table and logs (needed if maria_block_size!=TRANSLOG_PAGE_SIZE).
2007-12-12 22:33:36 +01:00
MARIA_SHARE *share;
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
if (!(info= get_MARIA_HA_from_UNDO_record(rec)))
return 0;
WL#3072 - Maria Recovery: recovery of state.auto_increment. When we log UNDO_KEY_INSERT for an auto_inc key, we update state.auto_increment (not anymore at the end of maria_write() except if this is a non-transactional table). When Recovery sees UNDO_KEY_INSERT in the REDO phase, it reads the auto_inc value from it and updates state.auto_increment. mysql-test/r/maria-recovery.result: Without the code fix, there would be in CHECK TABLE: "Auto-increment value: 0 is smaller than max used value: 3" and no AUTO_INCREMENT= clause in SHOW CREATE TABLE. mysql-test/t/maria-recovery.test: Test of recovery of state.auto_increment: from an old table, does the replaying of the log set state.auto_increment to what it should be. storage/maria/ma_check.c: new way of calling ma_retrieve_auto_increment(): pass key storage/maria/ma_key.c: ma_retrieve_auto_increment() now operates directly with a pointer to the key and not on the record. storage/maria/ma_key_recover.c: dedicated write_hook_for_undo_key_insert(): sets state.auto_increment under log's mutex. storage/maria/ma_key_recover.h: Dedicated hook for UNDO_KEY_INSERT, to set state.auto_increment. Such hook needs a new member st_msg_write_hook_for_undo_key::auto_increment, which contains the auto_increment value inserted. storage/maria/ma_loghandler.c: UNDO_KEY_INSERT gets a dedicated write_hook, to set auto_increment. storage/maria/ma_recovery.c: When in the REDO phase we see UNDO_KEY_INSERT: if the state is older than this record, and the key is the auto_increment one, read the key's value from the log record and update state.auto_increment. storage/maria/ma_test_all.sh: use $maria_path to be able to run from /dev/shm (faster) storage/maria/ma_update.c: bool is more of C++, using my_bool. If table is transactional, state.auto_increment is already updated in write_hook_for_undo_key_insert(). 
storage/maria/ma_write.c: If table is transactional, state.auto_increment is not updated at the end of maria_write() but rather in write_hook_for_undo_key_insert() (under log's mutex, so that a concurrent checkpoint does not read state.auto_increment while it is changing - corrupted). _ma_ck_write_btree_with_log() extracts the auto_increment value from the key, puts it into msg.auto_increment, and this is passed to write_hook_for_undo_key_insert(). storage/maria/maria_def.h: change of prototype of ma_retrieve_auto_increment() storage/maria/maria_read_log.c: use default log file size. Use separate page caches for table and logs (needed if maria_block_size!=TRANSLOG_PAGE_SIZE).
2007-12-12 22:33:36 +01:00
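The recovery step described in the commit message above can be sketched in isolation. This is a minimal, standalone illustration, not the Maria code: the helper names (`reverse_key_bytes`, `read_le_uint`, `recover_auto_increment`) are hypothetical, and it assumes that a key stored with swapped bytes must be reversed into low-byte-first order before being decoded as an unsigned integer of at most 8 bytes. As with `set_if_bigger()` in the code that follows, the stored counter is only ever raised, never lowered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy len bytes from src into dst in reverse order. */
static void reverse_key_bytes(const unsigned char *src, size_t len,
                              unsigned char *dst)
{
  unsigned char *to= dst + len;
  const unsigned char *end= src + len;
  while (src != end)
    *--to= *src++;
}

/* Hypothetical helper: decode a low-byte-first unsigned integer (len <= 8). */
static uint64_t read_le_uint(const unsigned char *p, size_t len)
{
  uint64_t v= 0;
  assert(len <= 8);
  while (len--)
    v= (v << 8) | p[len];
  return v;
}

/*
  Hypothetical sketch of the recovery step: decode the key value found in
  a log record and return the larger of it and the current state value.
  'swapped' plays the role of a byte-swapped key segment (assumption).
*/
static uint64_t recover_auto_increment(uint64_t state_value,
                                       const unsigned char *key, size_t len,
                                       int swapped)
{
  unsigned char reversed[8];
  uint64_t value;
  assert(len >= 1 && len <= sizeof(reversed));
  if (swapped)
  {
    /* Put the key back into low-byte-first order before decoding it. */
    reverse_key_bytes(key, len, reversed);
    key= reversed;
  }
  value= read_le_uint(key, len);
  return value > state_value ? value : state_value;
}
```

Taking the maximum means that replaying the same record twice leaves the counter unchanged.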
  share= info->s;
  if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
  {
    const uchar *ptr= rec->header + LSN_STORE_SIZE + FILEID_STORE_SIZE;
    uint keynr= key_nr_korr(ptr);
    if (share->base.auto_key == (keynr + 1)) /* it's auto-increment */
    {
      const HA_KEYSEG *keyseg= info->s->keyinfo[keynr].seg;
      ulonglong value;
      char llbuf[22];
      uchar *to;
      tprint(tracef, " state older than record\n");
      /* we read the record to find the auto_increment value */
      enlarge_buffer(rec);
      if (log_record_buffer.str == NULL ||
          translog_read_record(rec->lsn, 0, rec->record_length,
                               log_record_buffer.str, NULL) !=
          rec->record_length)
      {
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocating tail pages - Ensure we reserve place for directory entry when calculating place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clears freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
        eprint(tracef, "Failed to read record");
        return 1;
      }
      to= log_record_buffer.str + LSN_STORE_SIZE + FILEID_STORE_SIZE +
        KEY_NR_STORE_SIZE;
      if (keyseg->flag & HA_SWAP_KEY)
      {
        /* We put key from log record to "data record" packing format... */
Added versioning of Maria index Store max_trid in index file as state.create_trid. This is used to pack all transids in the index pages relative to max possible transid for file. Enable versioning for transactional tables with index. Tables with an auto-increment key, rtree or fulltext keys are not versioned. Changed info->lastkey to type MARIA_KEY. Removed info->lastkey_length as this is now part of info->lastkey Renamed old info->lastkey to info->lastkey_buff Use exact key lenghts for keys, not USE_WHOLE_KEY For partial key searches, use SEARCH_PART_KEY When searching to insert new key on page, use SEARCH_INSERT to mark that key has rowid Changes done in a lot of files: - Modified functions to use MARIA_KEY instead of key pointer and key length - Use keyinfo->root_lock instead of share->key_root_lock[keynr] - Simplify code by using local variable keyinfo instead if share->keyinfo[i] - Added #fdef EXTERNAL_LOCKING around removed state elements - HA_MAX_KEY_BUFF -> MARIA_MAX_KEY_BUFF (to reserve space for transid) - Changed type of 'nextflag' to uint32 to ensure all SEARCH_xxx flags fits into it .bzrignore: Added missing temporary directory extra/Makefile.am: comp_err is now deleted on make distclean include/maria.h: Added structure MARIA_KEY, which is used for intern key objects in Maria. Changed functions to take MARIA_KEY as an argument instead of pointer to packed key. Changed some functions that always return true or false to my_bool. Added virtual function make_key() to avoid if in _ma_make_key() Moved rw_lock_t for locking trees from share->key_root_lock to MARIA_KEYDEF. This makes usage of the locks simpler and faster include/my_base.h: Added HA_RTREE_INDEX flag to mark rtree index. 
Used for easier checks in ma_check() Added SEARCH_INSERT to be used when inserting new keys Added SEARCH_PART_KEY for partial searches Added SEARCH_USER_KEY_HAS_TRANSID to be used when key we use for searching in btree has a TRANSID Added SEARCH_PAGE_KEY_HAS_TRANSID to be used when key we found in btree has a transid include/my_handler.h: Make next_flag 32 bit to make sure we can handle all SEARCH_ bits mysql-test/include/maria_empty_logs.inc: Read and restore current database; Don't assume we are using mysqltest. Don't log use databasename to log. Using this include should not cause any result changes. mysql-test/r/maria-gis-rtree-dynamic.result: Updated results after adding some check table commands to help pinpoint errors mysql-test/r/maria-mvcc.result: New tests mysql-test/r/maria-purge.result: New result after adding removal of logs mysql-test/r/maria-recovery-big.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria-recovery-bitmap.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria-recovery-rtree-ft.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria-recovery.result: maria_empty_logs doesn't log 'use mysqltest' anymore mysql-test/r/maria.result: New tests mysql-test/r/variables-big.result: Don't log id as it's not predictable mysql-test/suite/rpl_ndb/r/rpl_truncate_7ndb_2.result: Updated results to new binlog results. 
(Test has not been run in a long time as it requires --big) mysql-test/suite/rpl_ndb/t/rpl_truncate_7ndb_2-master.opt: Moved file to ndb replication test directory mysql-test/suite/rpl_ndb/t/rpl_truncate_7ndb_2.test: Fixed wrong path to included tests mysql-test/t/maria-gis-rtree-dynamic.test: Added some check table commands to help pinpoint errors mysql-test/t/maria-mvcc.test: New tests mysql-test/t/maria-purge.test: Remove logs to make test results predictable mysql-test/t/maria.test: New tests for some possible problems mysql-test/t/variables-big.test: Don't log id as it's not predictable mysys/my_handler.c: Updated function comment to reflect old code Changed nextflag to be uint32 to ensure we can have flags > 16 bit Changed checking if we are in insert with NULL keys as next_flag can now include additional bits that have to be ignored. Added SEARCH_INSERT flag to be used when inserting new keys in btree. This flag tells us the that the keys includes row position and it's thus safe to remove SEARCH_FIND Added comparision of transid. This is only done if the keys actually have a transid, which is indicated by nextflag mysys/my_lock.c: Fixed wrong test (Found by Guilhem) scripts/Makefile.am: Ensure that test programs are deleted by make clean sql/rpl_rli.cc: Moved assignment order to fix compiler warning storage/heap/hp_write.c: Add SEARCH_INSERT to signal ha_key_cmp that we we should also compare rowid for keys storage/maria/Makefile.am: Remove also maria log files when doing make distclean storage/maria/ha_maria.cc: Use 'file->start_state' as default state for transactional tables without versioning At table unlock, set file->state to point to live state. (Needed for information schema to pick up right number of rows) In ha_maria::implicit_commit() move all locked (ie open) tables to new transaction. This is needed to ensure ha_maria->info doesn't point to a deleted history event. Disable concurrent inserts for insert ... 
select and table changes with subqueries if statement based replication as this would cause wrong results on slave storage/maria/ma_blockrec.c: Updated comment storage/maria/ma_check.c: Compact key pages (removes transid) when doing --zerofill Check that 'page_flag' on key pages contains KEYPAGE_FLAG_HAS_TRANSID if there is a single key on the page with a transid Modified functions to use MARIA_KEY instead of key pointer and key length Use new interface to _ma_rec_pos(), _ma_dpointer(), _ma_ft_del(), ma_update_state_lsn() Removed not needed argument from get_record_for_key() Fixed that we check doesn't give errors for RTREE; We now treath these like SPATIAL Remove some SPATIAL specific code where the virtual functions can handle this in a general manner Use info->lastkey_buff instead of info->lastkey _ma_dpos() -> _ma_row_pos_from_key() _ma_make_key() -> keyinfo->make_key() _ma_print_key() -> _ma_print_keydata() _ma_move_key() -> ma_copy_copy() Add SEARCH_INSERT to signal ha_key_cmp that we we should also compare rowid for keys Ensure that data on page doesn't overwrite page checksum position Use DBUG_DUMP_KEY instead of DBUG_DUMP Use exact key lengths instead of USE_WHOLE_KEY to ha_key_cmp() Fixed check if rowid points outside of BLOCK_RECORD data file Use info->lastkey_buff instead of key on stack in some safe places Added #fdef EXTERNAL_LOCKING around removed state elements storage/maria/ma_close.c: Use keyinfo->root_lock instead of share->key_root_lock[keynr] storage/maria/ma_create.c: Removed assert that is already checked in maria_init() Force transactinal tables to be of type BLOCK_RECORD Fixed wrong usage of HA_PACK_RECORD (should be HA_OPTION_PACK_RECORD) Mark keys that uses HA_KEY_ALG_RTREE with HA_RTREE_INDEX for easier handling of these in ma_check Store max_trid in index file as state.create_trid. This is used to pack all transids in the index pages relative to max possible transid for file. 
storage/maria/ma_dbug.c: Changed _ma_print_key() to use MARIA_KEY storage/maria/ma_delete.c: Modified functions to use MARIA_KEY instead of key pointer and key length info->lastkey2-> info->lastkey_buff2 Added SEARCH_INSERT to signal ha_key_cmp that we we should also compare rowid for keys Use new interface for get_key(), _ma_get_last_key() and others _ma_dpos() -> ma_row_pos_from_key() Simplify setting of prev_key in del() Ensure that KEYPAGE_FLAG_HAS_TRANSID is set in page_flag if key page has transid Treath key pages that may have a transid as if keys would be of variable length storage/maria/ma_delete_all.c: Reset history state if maria_delete_all_rows() are called Update parameters to _ma_update_state_lsns() call storage/maria/ma_extra.c: Store and restore info->lastkey storage/maria/ma_ft_boolean_search.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_ft_nlq_search.c: Modified functions to use MARIA_KEY instead of key pointer and key length Use lastkey_buff2 instead of info->lastkey+info->s->base.max_key_length (same thing) storage/maria/ma_ft_update.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_ftdefs.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_fulltext.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_init.c: Check if blocksize is legal (Moved test here from ma_open()) storage/maria/ma_key.c: Added functions for storing/reading of transid Modified functions to use MARIA_KEY instead of key pointer and key length Moved _ma_sp_make_key() out of _ma_make_key() as we now use keyinfo->make_key to create keys Add transid to keys if table is versioned Added _ma_copy_key() storage/maria/ma_key_recover.c: Add logging of page_flag (holds information if there are keys with transid on page) Changed DBUG_PRINT("info" -> DBUG_PRINT("redo" as the redo logging can be quite extensive Added 
lots of DBUG_PRINT() Added support for index page operations: KEY_OP_SET_PAGEFLAG and KEY_OP_COMPACT_PAGE storage/maria/ma_key_recover.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_locking.c: Added new arguments to _ma_update_state_lsns_sub() storage/maria/ma_loghandler.c: Fixed all logging of LSN to look similar in DBUG log Changed if (left != 0) to if (left) as the later is used also later in the code storage/maria/ma_loghandler.h: Added new index page operations storage/maria/ma_open.c: Removed allocated "state_dummy" and instead use share->state.common for transactional tables that are not versioned This is needed to not get double increments of state.records (one in ma_write.c and on when log is written) Changed info->lastkey to MARIA_KEY type Removed resetting of MARIA_HA variables that have 0 as default value (as info is zerofilled) Enable versioning for transactional tables with index. Tables with an auto-increment key, rtree or fulltext keys are not versioned. Check on open that state.create_trid is correct Extend share->base.max_key_length in case of transactional table so that it can hold transid Removed 4.0 compatible fulltext key mode as this is not relevant for Maria Removed old and wrong #ifdef ENABLE_WHEN_WE_HAVE_TRANS_ROW_ID code block Initialize all new virtual function pointers Removed storing of state->unique, state->process and store state->create_trid instead storage/maria/ma_page.c: Added comment to describe key page structure Added functions to compact key page and log the compact operation storage/maria/ma_range.c: Modified functions to use MARIA_KEY instead of key pointer and key length Use SEARCH_PART_KEY indicator instead of USE_WHOLE_KEY to detect if we are doing a part key search Added handling of pages with transid storage/maria/ma_recovery.c: Don't assert if table we opened are not transactional. 
This may be a table which has been changed from transactional to not transactinal Added new arguments to _ma_update_state_lsns() storage/maria/ma_rename.c: Added new arguments to _ma_update_state_lsns() storage/maria/ma_rkey.c: Modified functions to use MARIA_KEY instead of key pointer and key length Don't use USE_WHOLE_KEY, use real length of key Use share->row_is_visible() to test if row is visible Moved search_flag == HA_READ_KEY_EXACT out of 'read-next-row' loop as this only need to be tested once Removed test if last_used_keyseg != 0 as this is always true storage/maria/ma_rnext.c: Modified functions to use MARIA_KEY instead of key pointer and key length Simplify code by using local variable keyinfo instead if share->keyinfo[i] Use share->row_is_visible() to test if row is visible storage/maria/ma_rnext_same.c: Modified functions to use MARIA_KEY instead of key pointer and key length lastkey2 -> lastkey_buff2 storage/maria/ma_rprev.c: Modified functions to use MARIA_KEY instead of key pointer and key length Simplify code by using local variable keyinfo instead if share->keyinfo[i] Use share->row_is_visible() to test if row is visible storage/maria/ma_rsame.c: Updated comment Simplify code by using local variable keyinfo instead if share->keyinfo[i] Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rsamepos.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rt_index.c: Modified functions to use MARIA_KEY instead of key pointer and key length Use better variable names Removed not needed casts _ma_dpos() -> _ma_row_pos_from_key() Use info->last_rtree_keypos to save position to key instead of info->int_keypos Simplify err: condition Changed return type for maria_rtree_insert() to my_bool as we are only intressed in ok/fail from this function storage/maria/ma_rt_index.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rt_key.c: Modified 
functions to use MARIA_KEY instead of key pointer and key length Simplify maria_rtree_add_key by combining idenitcal code and removing added_len storage/maria/ma_rt_key.h: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_rt_mbr.c: Changed type of 'nextflag' to uint32 Added 'to' argument to RT_PAGE_MBR_XXX functions to more clearly see which variables changes value storage/maria/ma_rt_mbr.h: Changed type of 'nextflag' to uint32 storage/maria/ma_rt_split.c: Modified functions to use MARIA_KEY instead of key pointer and key length key_length -> key_data_length to catch possible errors storage/maria/ma_rt_test.c: Fixed wrong comment Reset recinfo to avoid valgrind varnings Fixed wrong argument to create_record() that caused test to fail storage/maria/ma_search.c: Modified functions to use MARIA_KEY instead of key pointer and key length Added support of keys with optional trid Test for SEARCH_PART_KEY instead of USE_WHOLE_KEY to detect part key reads _ma_dpos() -> _ma_row_pos_from_key() If there may be keys with transid on the page, have _ma_bin_search() call _ma_seq_search() Add _ma_skip_xxx() functions to quickly step over keys (faster than calling get_key() in most cases as we don't have to copy key data) Combine similar code at end of _ma_get_binary_pack_key() Removed not used function _ma_move_key() In _ma_search_next() don't call _ma_search() if we aren't on a nod page. 
Update info->cur_row.trid with trid for found key Removed some not needed casts Added _ma_trid_from_key() Use MARIA_SHARE instead of MARIA_HA as arguments to _ma_rec_pos(), _ma_dpointer() and _ma_xxx_keypos_to_recpos() to make functions faster and smaller storage/maria/ma_sort.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_sp_defs.h: _ma_sp_make_key() now fills in and returns (MARIA_KEY *) value storage/maria/ma_sp_key.c: _ma_sp_make_key() now fills in and returns (MARIA_KEY *) value Don't test sizeof(double), test against 8 as we are using float8store() Use mi_float8store() instead of doing swap of value (same thing but faster) storage/maria/ma_state.c: maria_versioning() now only calls _ma_block_get_status() if table supports versioning Added _ma_row_visible_xxx() functions for different occasions When emptying history, set info->state to point to the first history event. storage/maria/ma_state.h: Added _ma_row_visible_xxx() prototypes storage/maria/ma_static.c: Indentation changes storage/maria/ma_statrec.c: Fixed arguments to _ma_dpointer() and _ma_rec_pos() storage/maria/ma_test1.c: Call init_thr_lock() if we have versioning storage/maria/ma_test2.c: Call init_thr_lock() if we have versioning storage/maria/ma_unique.c: Modified functions to use MARIA_KEY storage/maria/ma_update.c: Modified functions to use MARIA_KEY instead of key pointer and key length storage/maria/ma_write.c: Modified functions to use MARIA_KEY instead of key pointer and key length Simplify code by using local variable keyinfo instead if share->keyinfo[i] In _ma_enlarge_root(), mark in page_flag if new key has transid _ma_dpos() -> _ma_row_pos_from_key() Changed return type of _ma_ck_write_tree() to my_bool as we are only testing if result is true or not Moved 'reversed' to outside block as area was used later storage/maria/maria_chk.c: Added error if trying to sort with HA_BINARY_PACK_KEY Use new interface to get_key() and _ma_dpointer() 
_ma_dpos() -> _ma_row_pos_from_key() storage/maria/maria_def.h: Modified functions to use MARIA_KEY instead of key pointer and key length Added 'common' to MARIA_SHARE->state for storing state for transactional tables without versioning Added create_trid to MARIA_SHARE Removed not used state variables 'process' and 'unique' Added defines for handling TRID's in index pages Changed to use MARIA_SHARE instead of MARIA_HA for some functions Added 'have_versioning' flag if table supports versioning Moved key_root_lock from MARIA_SHARE to MARIA_KEYDEF Changed last_key to be of type MARIA_KEY. Removed lastkey_length lastkey -> lastkey_buff, lastkey2 -> lastkey_buff2 Added _ma_get_used_and_nod_with_flag() for faster access to page data when page_flag is read Added DBUG_DUMP_KEY for easier DBUG_DUMP of a key Changed 'nextflag' and assocaited variables to uint32 storage/maria/maria_ftdump.c: lastkey -> lastkey_buff storage/maria/trnman.c: Fixed wrong initialization of min_read_from and max_commit_trid Added trnman_get_min_safe_trid() storage/maria/unittest/ma_test_all-t: Added --start-from storage/myisam/mi_check.c: Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparision for inserting key on page in rowid order storage/myisam/mi_delete.c: Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparision for inserting key on page in rowid order storage/myisam/mi_range.c: Updated comment storage/myisam/mi_write.c: Added SEARCH_INSERT, as ha_key_cmp() needs it when doing key comparision for inserting key on page in rowid order storage/myisam/rt_index.c: Fixed wrong parameter to rtree_get_req() which could cause crash
2008-06-26 07:18:28 +02:00
        uchar reversed[MARIA_MAX_KEY_BUFF];
        uchar *key_ptr= to;
        uchar *key_end= key_ptr + keyseg->length;
        to= reversed + keyseg->length;
        do
        {
          *--to= *key_ptr++;
        } while (key_ptr != key_end);
        /* ... so that we can read it with: */
      }
      value= ma_retrieve_auto_increment(to, keyseg->type);
      set_if_bigger(share->state.auto_increment, value);
      llstr(share->state.auto_increment, llbuf);
      tprint(tracef, " auto-inc %s\n", llbuf);
    }
  }
  _ma_unpin_all_pages(info, rec->lsn);
First part of redo/undo for key pages

Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows
Checksum for MyISAM now ignores NULL and not used part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable

include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
  Added new error message to be used when handler initialization failed
include/my_global.h:
  Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
  Added const to some parameters
mysys/array.c:
  More DBUG
mysys/my_error.c:
  Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h:
  Renamed variables to avoid variable shadowing
sql/handler.h:
  Renamed parameter to avoid variable name conflict
sql/item.h:
  Renamed variables to avoid variable shadowing
sql/log_event_old.h:
  Renamed variables to avoid variable shadowing
sql/set_var.h:
  Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc:
  Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
  This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am:
  Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
  Removed extra empty line
storage/maria/ma_checkpoint.c:
  Remove not needed trnman.h
storage/maria/ma_close.c:
  Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
  Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c:
  New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (we need now also for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address of the MARIA_PINNED_PAGE used
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some not needed arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some not needed initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
  Removed end space
storage/maria/ma_check.c:
  Removed end space
storage/maria/ma_create.c:
  Removed end space
storage/maria/ma_locking.c:
  Removed end space
storage/maria/ma_packrec.c:
  Removed end space
storage/maria/ma_pagecache.c:
  Removed end space
storage/maria/ma_panic.c:
  Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  use MARIA_RECORD_POS of record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c:
  Removed end space
storage/maria/ma_statrec.c:
  Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Remove not used code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
  Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open()
  Zerofill new used pages for:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c:
  Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
  Updated arguments for changed function calls
storage/myisam/mi_check.c:
  Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
  Update checksums to ignore null columns
storage/myisam/mi_create.c:
  Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
  Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
  Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
  New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
  New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
  New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  return 0;
}
prototype_redo_exec_hook(UNDO_KEY_DELETE)
{
Fixes for redo/undo logging of key pages

New extendable format for maria_log_control file
Fixed some compiler warnings

include/maria.h:
  Added maria_disable_logging() and maria_enable_logging()
mysql-test/include/maria_verify_recovery.inc:
  Updated tests now when key redo/undo works
mysql-test/r/maria-recovery.result:
  Updated tests now when key redo/undo works
storage/maria/ma_blockrec.c:
  Use unified CLR code
  Added rec_lsn for full pages
  Moved clr write hook to ma_key_recover.c
  Changed REDO code to keep pages pinned until undo
  Mark page_link's as changed
storage/maria/ma_blockrec.h:
  Moved write_hook_for_clr_end() to ma_key_recover.c
storage/maria/ma_check.c:
  Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE
  Fixed wrong warning when checking files after maria_pack
  When unpacking files, we have to use new keypos_to_recpos method
  When doing repair, we can disregard index key file pages in page cache
storage/maria/ma_commit.c:
  Added simple enable/disable logging functions (Needed for recovery)
storage/maria/ma_control_file.c:
  Make maria control file extendable without having to make it incompatible for older versions
storage/maria/ma_control_file.h:
  New error messages
  Added CONTROL_FILE_VERSION
storage/maria/ma_delete.c:
  Added redo/undo for key pages
  change_length -> changed_length to make things similar
  More comments & more DBUG
storage/maria/ma_key_recover.c:
  Unified CLR method
  Moved here write_hook_for_clr_end() and common keypage log functions
  Changed REDO to keep pages pinned until undo
  Changed UNDO code to change key_root under log mutex
storage/maria/ma_key_recover.h:
  New structures and functions
storage/maria/ma_loghandler.c:
  Include needed files
storage/maria/ma_open.c:
  Change maria_open() to use pread() instead of read()
storage/maria/ma_page.c:
  Fixed bug in key_del handling
  Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined
storage/maria/ma_pagecache.c:
  Indentation and spelling fixes
  More DBUG
  Added helper function: pagecache_block_link_to_buffer()
storage/maria/ma_pagecache.h:
  Added pagecache_block_link_to_buffer()
storage/maria/ma_recovery.c:
  Fixed state.changed
  Fixed that REDO keeps pages pinned until UNDO
  Some bug fixes from previous commit
  Fixes for UNDO/REDO of key pages
storage/maria/ma_search.c:
  Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes.
storage/maria/ma_test1.c:
  Fixed bug with not initialized variable
storage/maria/ma_test2.c:
  Removed not used code
storage/maria/ma_test_all.res:
  Updated results
storage/maria/ma_test_all.sh:
  Changed one test to test more
  Removed timing tests as not relevant here
storage/maria/ma_test_recovery.expected:
  Updated test result after redo/undo of key pages works
storage/maria/ma_test_recovery:
  Updated test after redo/undo of key pages works
storage/maria/ma_write.c:
  Moved some general log functions to ma_key_recover.c
  Fixed some bugs in undo
  Moved ma_log_split() to _ma_split_page()
  Small changes in some function arguments to be able to do redo logging
storage/maria/maria_chk.c:
  disable logging while doing repair table
storage/maria/maria_def.h:
  New function prototypes
  Move some structs and functions to ma_key_recover.c
storage/maria/unittest/ma_control_file-t.c:
  Updated with patch from Sanja
  NOTE: This is not complete and needs to be updated to new control file format
storage/maria/unittest/ma_test_loghandler-t.c:
  Fixed compiler warning
2007-11-20 16:42:16 +01:00
  MARIA_HA *info;
WL#3072 - Maria Recovery

* to honour WAL we now force the whole log when flushing a bitmap page.
* ability to intentionally crash in various places for recovery testing
* bugfix (dirty pages list found in checkpoint record was ignored)
* smaller checkpoint record
* misc small cleanups and comments

mysql-test/include/maria_empty_logs.inc:
  maria-purge.test creates ~11 logs, remove them all
mysql-test/r/maria-recovery-bitmap.result:
  result is good; without the _ma_bitmap_get_log_address() call, we got check error
  Bitmap at 0 has pages reserved outside of data file length
mysql-test/r/maria-recovery.result:
  result update
mysql-test/t/maria-recovery-bitmap.test:
  enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)".
mysql-test/t/maria-recovery.test:
  test of checkpoint
sql/sql_table.cc:
  comment
storage/maria/ha_maria.cc:
  _ma_reenable_logging_for_table() now includes file->trn=0.
  At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition.
  Ability to crash at the end of bulk insert when indices have been enabled.
storage/maria/ma_bitmap.c:
  Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it.
  _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL.
storage/maria/ma_blockrec.c:
  get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c.
storage/maria/ma_blockrec.h:
  functions which need to be exported
storage/maria/ma_check.c:
  create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already.
storage/maria/ma_checkpoint.c:
  As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1.
storage/maria/ma_commit.c:
  removing duplicate functions (see _ma_tmp_disable_logging_for_table())
storage/maria/ma_extra.c:
  Monty fixed
storage/maria/ma_key_recover.c:
  comment
storage/maria/ma_locking.c:
  Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional.
storage/maria/ma_loghandler.c:
  update to new prototype
storage/maria/ma_open.c:
  set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality.
storage/maria/ma_pagecache.c:
  Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant).
  When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed.
  As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1.
storage/maria/ma_pagecache.h:
  get_log_address callback
storage/maria/ma_panic.c:
  No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway.
storage/maria/ma_recovery.c:
  Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno.
  If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table).
  If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case).
  Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)).
  When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore).
storage/maria/ma_write.c:
  'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written.
storage/maria/maria_chk.c:
  remove use of duplicate functions.
storage/maria/maria_def.h:
  set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function.
storage/maria/unittest/ma_pagecache_consist.c:
  new prototype
storage/maria/unittest/ma_pagecache_single.c:
  new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
  new prototype
2007-12-30 21:32:07 +01:00
  set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
  if (!(info= get_MARIA_HA_from_UNDO_record(rec)))
    return 0;
  _ma_unpin_all_pages(info, rec->lsn);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  return 0;
}
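The commit note above says that, for transactional tables, the record number stored in a key is shifted up by one bit so that the freed low bit can signal whether a transaction id follows. A minimal sketch of that packing is shown here. The helper names (`pack_recpos`, `unpack_recpos`, `has_transid`) and the use of a 64-bit integer are assumptions made for illustration; they are not taken from the Maria sources.

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical helpers (not the Maria API): store a record position
  together with a one-bit "transid follows" flag in a single integer.
*/
static uint64_t pack_recpos(uint64_t recpos, int transid_follows)
{
  /* The top bit must be free, otherwise the shift would lose it. */
  assert(recpos <= (UINT64_MAX >> 1));
  return (recpos << 1) | (transid_follows ? 1u : 0u);
}

/* Recover the original record position by dropping the flag bit. */
static uint64_t unpack_recpos(uint64_t stored)
{
  return stored >> 1;
}

/* Non-zero if the stored value says a transaction id follows the key. */
static int has_transid(uint64_t stored)
{
  return (int) (stored & 1);
}
```

The cost of this layout is one bit of address range per record position; in exchange, a reader can tell from the key alone whether extra bytes follow.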
prototype_redo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT)
{
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
  MARIA_SHARE *share;
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
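The commit note above states that the hash key in all_dirty_pages is now built from index_or_data, the table's short id and the page number, instead of a file descriptor and page number. One way such a key could be packed into a single integer is sketched below. The bit layout and the name `dirty_page_key` are assumptions for illustration only; the note gives the three components but not how they are combined.

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical layout, for illustration only:
    bit 63       1 = index file, 0 = data file
    bits 40..55  2-byte table short id
    bits 0..39   page number (assumed to fit in 5 bytes)
*/
static uint64_t dirty_page_key(int is_index, uint16_t short_id,
                               uint64_t pageno)
{
  assert(pageno < ((uint64_t) 1 << 40));
  return ((uint64_t) (is_index != 0) << 63) |
         ((uint64_t) short_id << 40) |
         pageno;
}
```

A key built this way does not depend on file descriptors, so it stays meaningful after a restart, when the descriptors recorded by the crashed process no longer exist.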
  set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
  if (info == NULL)
    return 0;
  share= info->s;
  if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
  {
    uint key_nr;
    my_off_t page;
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficent REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo if key pages works storage/maria/ma_test_recovery: Updated test after redo/undo if key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and need to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
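/*
  Decode the record header: after the stored LSN and the table's file id
  come the key number and then the page number of the index root.
*/
key_nr= key_nr_korr(rec->header + LSN_STORE_SIZE + FILEID_STORE_SIZE);
page= page_korr(rec->header + LSN_STORE_SIZE + FILEID_STORE_SIZE +
KEY_NR_STORE_SIZE);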
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
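/*
  IMPOSSIBLE_PAGE_NO means the index has no root page; otherwise convert
  the page number into a byte offset in the index file.
*/
share->state.key_root[key_nr]= (page == IMPOSSIBLE_PAGE_NO ?
HA_OFFSET_ERROR :
page * share->block_size);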
2007-07-26 11:56:21 +02:00
}
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitely flushes log (WAL)) and flush log, both causes rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
/* Unpin the pages kept pinned for this group, stamping them with this LSN */
_ma_unpin_all_pages(info, rec->lsn);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
return 0;
2007-07-26 11:56:21 +02:00
}
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle)
- fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc)
- when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with a single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix)
- UNDO_BULK_INSERT_WITH_REPAIR is also used without repair, so it changes name

mysql-test/r/maria.result:
  tests for bugs fixed
mysql-test/t/maria.test:
  tests for bugs fixed
sql/sql_insert.cc:
  Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging.
storage/maria/ha_maria.cc:
  Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs and thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()).
storage/maria/ha_maria.h:
  variable now can have 3 states, not 2
storage/maria/ma_bitmap.c:
  name change
storage/maria/ma_blockrec.c:
  name change
storage/maria/ma_blockrec.h:
  name change
storage/maria/ma_check.c:
  * When maria_repair() re-enables logging it does not need to ask for a flush&sync, as it already did it by itself a few lines before.
  * the log record of bulk insert can be used even without repair
  * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush).
storage/maria/ma_key_recover.c:
  name change
storage/maria/ma_loghandler.c:
  name change
storage/maria/ma_loghandler.h:
  name change
storage/maria/ma_pagecache.c:
  A function to check, in debug builds, that no dirty pages exist for a file.
storage/maria/ma_pagecache.h:
  new function (nothing in non-debug)
storage/maria/ma_recovery.c:
  _ma_tmp_disable_logging() now sets info->trn to dummy_transaction_object when needed. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist).
storage/maria/maria_chk.c:
  comment
storage/maria/maria_def.h:
  new names
mysql-test/suite/rpl/r/rpl_stm_maria.result:
  result (it used to crash)
mysql-test/suite/rpl/t/rpl_stm_maria.test:
  Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
prototype_redo_exec_hook(UNDO_BULK_INSERT)
WL#3072 - Maria Recovery
Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash).
Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start; this is easier for recovery (tells it to skip old REDOs or even the UNDO phase) and for the user (tells them to repair) in case of crash; sync files in the end.
Recovery skips a missing or corrupted table and moves to the next record (in REDO or UNDO phase) to be more robust; warns if this happens in the UNDO phase.
Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes().
Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this.

include/myisamchk.h:
  new flag: bulk insert repair mustn't bump create_rename_lsn
mysql-test/lib/mtr_report.pl:
  skip normal warning in maria-recovery.test
mysql-test/r/maria-recovery.result:
  result: crash before bulk insert is committed causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing.
mysql-test/t/maria-recovery.test:
  - can't check the table or it would commit the transaction, but check is made after recovery.
  - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index).
storage/maria/CMakeLists.txt:
  new file
storage/maria/Makefile.am:
  new file
storage/maria/ha_maria.cc:
  - If bulk insert on a transactional table uses an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert().
  - write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record).
  - Adding back file->trn=NULL which was removed by mistake earlier.
storage/maria/ha_maria.h:
  new member (see ha_maria.cc)
storage/maria/ma_bitmap.c:
  Functions to create missing bitmaps:
  - one function which creates missing bitmaps in page cache, except the missing one with max offset, which it does not put into page cache as it will be modified very soon.
  - one function which the one above calls, and which creates bitmaps in page cache
  - one function to execute REDO_BITMAP_NEW_PAGE, which uses the second one above.
storage/maria/ma_blockrec.c:
  - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex.
  - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list
  - execution of UNDO_BULK_INSERT_WITH_REPAIR
storage/maria/ma_blockrec.h:
  new functions
storage/maria/ma_check.c:
  - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added
  - maria_repair() is allowed to re-enable logging only if it is the one which disabled it.
  - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record)
  - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair())
  - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end.
  - in parallel repair, some page flushes are not needed as they are done by initialize_variables_for_repair().
storage/maria/ma_create.c:
  Update skip_redo_lsn, create_rename_lsn optionally.
storage/maria/ma_delete_all.c:
  Need to sync files at end of maria_delete_all_rows(), if transactional.
storage/maria/ma_extra.c:
  During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE.
storage/maria/ma_key_recover.c:
  - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices.
  - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure
storage/maria/ma_key_recover.h:
  new prototype
storage/maria/ma_locking.c:
  DBUG_VOID_RETURN missing
storage/maria/ma_loghandler.c:
  UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type.
storage/maria/ma_loghandler.h:
  new UNDO and REDO
storage/maria/ma_open.c:
  Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value; this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes.
storage/maria/ma_recovery.c:
  - Skip a corrupted, missing, or repaired-with-maria_chk table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings.
  - Skip REDO|UNDO in REDO phase if <skip_redo_lsn.
  - If UNDO phase fails, delete transactions to not make trnman assert.
  - Update skip_redo_lsn when playing REDO_CREATE_TABLE
  - Don't record UNDOs for old transactions which we don't know (long_trid==0)
  - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c)
  - Execution of UNDO_BULK_INSERT_WITH_REPAIR
  - Don't try to find a page number in REDO_DELETE_ALL
  - Pieces moved to ma_recovery_util.c
storage/maria/ma_rename.c:
  name change
storage/maria/ma_static.c:
  I modified layout of the index' header (inserted skip_redo_lsn in its middle)
storage/maria/ma_test2.c:
  allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT
storage/maria/ma_test_recovery.expected:
  6 as testflag instead of 4
storage/maria/ma_test_recovery:
  Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug.
storage/maria/maria_chk.c:
  skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk().
storage/maria/maria_def.h:
  New member skip_redo_lsn in the state, and comments
storage/maria/maria_pack.c:
  skip_redo_lsn should be updated too, for consistency
storage/maria/ma_recovery_util.c:
  _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c).
storage/maria/ma_recovery_util.h:
  new file
2008-01-17 23:59:32 +01:00
{
/*
If the repair finished, it wrote and synced the state. If it didn't finish,
we are going to empty the table and that will fix the state.
*/
set_undo_lsn_for_active_trans(rec->short_trid, rec->lsn);
return 0;
}
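The hook above only records the UNDO's LSN for the transaction; the rollback itself is left to the UNDO phase. A minimal sketch of that bookkeeping follows, assuming a table of transactions indexed by short id in which a `long_trid` of 0 marks an unknown transaction (as the COMMIT hook's check and the commit note "Don't record UNDOs for old transactions which we don't know (long_trid==0)" suggest). The names `lsn_t`, `LSN_NONE`, `struct active_trans`, `first_undo_lsn`, `note_undo_lsn()` and `forget_trans()` are hypothetical and are not the identifiers used in ma_recovery.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t lsn_t;               /* hypothetical stand-in for LSN */
#define LSN_NONE ((lsn_t) 0)          /* hypothetical "no LSN" marker */
#define MAX_SHORT_TRID 65535

/* Hypothetical per-transaction slot, indexed by short transaction id. */
struct active_trans
{
  uint64_t long_trid;                 /* 0 means "transaction not known" */
  lsn_t undo_lsn;                     /* most recent UNDO seen so far */
  lsn_t first_undo_lsn;               /* oldest UNDO seen so far */
};

static struct active_trans active[MAX_SHORT_TRID + 1];

/*
  Remember the UNDO's LSN for a transaction.
  Returns 1 if recorded, 0 if the transaction is unknown (it presumably
  finished long ago), in which case nothing is stored.
*/
static int note_undo_lsn(uint16_t short_trid, lsn_t lsn)
{
  struct active_trans *t= &active[short_trid];
  assert(lsn != LSN_NONE);
  if (t->long_trid == 0)
    return 0;
  t->undo_lsn= lsn;
  if (t->first_undo_lsn == LSN_NONE)
    t->first_undo_lsn= lsn;
  return 1;
}

/*
  Forget a transaction by zeroing the whole slot, so that no stale LSN
  survives into a later reuse of the same short id.
*/
static void forget_trans(uint16_t short_trid)
{
  memset(&active[short_trid], 0, sizeof(active[short_trid]));
}
```

Zeroing the whole slot rather than only `long_trid` mirrors the fix described in the commit notes for this file ("we must really forget about it with bzero()"); whether the real structure carries the same fields is an assumption of this sketch.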
Fix for BUG#37876 "Importing Maria table from other server via binary copy does not work":
- after auto-zerofill (ha_maria::check_and_repair()) the table kept its state's LSNs unchanged, which could be the same as the create_rename_lsn of another pre-existing table, which would break versioning as this LSN serves as unique identifier in the versioning code (in maria_open()). Even the state pieces which maria_zerofill() did change were lost (because they didn't go to disk).
- after this fix, if two tables were auto-zerofilled at the same time (by _ma_mark_changed()) they could receive the same create_rename_lsn, which would break versioning again. Fix is to write a log record each time a table is imported.
- Print state's LSNs (create_rename_lsn, is_of_horizon, skip_redo_lsn) and UUID in maria_chk -dvv.

mysql-test/r/maria-autozerofill.result:
  result
mysql-test/t/maria-autozerofill.test:
  Test for auto-zerofilling
storage/maria/ha_maria.cc:
  The state changes done by auto-zerofilling never reached disk.
storage/maria/ma_check.c:
  When zerofilling a table, including its pages' LSNs, new state LSNs are needed next time the table is imported into a Maria instance.
storage/maria/ma_create.c:
  Write LOGREC_IMPORTED_TABLE when importing a table. This is informative and ensures that the table gets a unique create_rename_lsn even though multiple tables are imported by concurrent threads (it advances the log's end LSN).
storage/maria/ma_key_recover.c:
  comment
storage/maria/ma_locking.c:
  instead of using translog_get_horizon() for state's LSNs of imported table, use the LSN of to-be-written LOGREC_IMPORTED_TABLE.
storage/maria/ma_loghandler.c:
  New type of log record
storage/maria/ma_loghandler.h:
  New type of log record
storage/maria/ma_loghandler_lsn.h:
  New name for constant, as it can now be used not only by maria_chk but by auto-zerofill too.
storage/maria/ma_open.c:
  instead of using translog_get_horizon() for state's LSNs of imported table, use the LSN of to-be-written LOGREC_IMPORTED_TABLE.
storage/maria/ma_recovery.c:
  print content of LOGREC_IMPORTED_TABLE in maria_read_log.
storage/maria/maria_chk.c:
  print info about LSNs of the table's state, and UUID, when maria_chk -dvv
storage/maria/maria_pack.c:
  new name for constant
storage/maria/unittest/ma_test_recovery.pl:
  Now that maria_chk -dvv shows state LSNs and UUID, those need to be filtered out, as maria_read_log -a does not use the same ones as at original run.
2008-07-09 11:02:27 +02:00
prototype_redo_exec_hook(IMPORTED_TABLE)
{
char *name;
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
{
eprint(tracef, "Failed to read record");
return 1;
}
name= (char *)log_record_buffer.str;
tprint(tracef, "Table '%s' was imported (auto-zerofilled) in this Maria instance\n", name);
return 0;
}
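The hook above reads the whole record into a buffer and then treats the buffer as a C string holding the table name. That is only safe if the logged bytes include a terminating zero. A minimal defensive sketch of that step, with the hypothetical helper name `name_from_record()`; whether LOGREC_IMPORTED_TABLE always logs the end zero is an assumption here, not something this section confirms:

```c
#include <stddef.h>
#include <string.h>

/*
  Return the record buffer as a C string, or NULL if the buffer is
  missing, empty, or holds no terminating zero within its 'len' bytes.
  The string is used in place; nothing is copied.
*/
static const char *name_from_record(const unsigned char *buf, size_t len)
{
  if (buf == NULL || len == 0)
    return NULL;
  if (memchr(buf, '\0', len) == NULL)
    return NULL;                      /* not terminated: unsafe for "%s" */
  return (const char *) buf;
}
```

A caller would print the name only when the helper returns non-NULL and report a read failure otherwise.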
prototype_redo_exec_hook(COMMIT)
{
uint16 sid= rec->short_trid;
TrID long_trid= all_active_trans[sid].long_trid;
char llbuf[22];
if (long_trid == 0)
{
tprint(tracef, "We don't know about transaction with short_trid %u;"
"it probably committed long ago, forget it\n", sid);
WL#3072 - Maria recovery.
* fix for bitmap vs checkpoint bug which could lead to corrupted tables in case of crashes at certain moments: a bitmap could be flushed to disk even though it was inconsistent with the log (it could be flushed before REDO-UNDO are written to the log). One bug remains, need code from others. Tests added. Fix is to pin unflushable bitmap pages, and let checkpoint wait for them to be flushable.
* fix for long_trid!=0 assertion failure at Recovery.
* fewer useless wakeups in the background flush|checkpoint thread.
* store global_trid_generator in checkpoint record.
mysql-test/r/maria-recovery.result: result update
mysql-test/t/maria-recovery.test: make it easier to locate subtests
storage/maria/ma_bitmap.c: When we send a bitmap to the pagecache, if this bitmap is not in a flushable state we keep it pinned and add it to a list; it will be unpinned when the bitmap is flushable again. A new function _ma_bitmap_flush_all() used by checkpoint. A new function _ma_bitmap_flushable() used by block format to signal when it starts modifying a bitmap and when it is done with it.
storage/maria/ma_blockrec.c: When starting a row operation (insert/update/delete), mark that the bitmap is not flushable (because for example INSERT is going to over-allocate in the bitmap to prevent other threads from using our data pages). If a checkpoint comes at this moment it will wait for the bitmap to be flushable before flushing it. When the operation ends, bitmap becomes flushable again; that transition is done under the bitmap's mutex (needed for correct synchronization with a concurrent checkpoint); but for INSERT/UPDATE this happens inside _ma_bitmap_release_unused() at a place where it already has the mutex, so the only penalty (an added mutex acquisition) is in DELETE and UNDO of INSERT. In case of errors after setting the bitmap unflushable, we must always set it back to flushable or checkpoint would block. Debug possibilities to force a sleep while the bitmap is over-allocated. In case of error in get_head_or_tail() in allocate_and_write_block_record(), we still need to unpin all pages. Bugfix: _ma_apply_redo_insert_row_blobs() produced wrong data_file_length.
storage/maria/ma_blockrec.h: new bitmap calls.
storage/maria/ma_checkpoint.c: filter_flush_indirect not needed anymore (flushing bitmap pages happens in _ma_bitmap_flush_all() now). So st_filter_param::is_data_file|pages_covered_by_bitmap not needed. Other filter_flush* don't need to flush bitmap anymore. Add debug possibility to flush all bitmap pages outside of a checkpoint, to simulate pagecache LRU eviction. When the background flush/checkpoint thread notices it has nothing to flush, it now sleeps directly until the next potential checkpoint moment instead of waking up every second. When in checkpoint we decide to not store a table in the checkpoint record (because it has logged no writes for example), we can also skip flushing this table.
storage/maria/ma_commit.c: comment is out-of-date
storage/maria/ma_key_recover.c: comment fix
storage/maria/ma_loghandler.c: comment is out-of-date
storage/maria/ma_open.c: comment is out-of-date
storage/maria/ma_pagecache.c: comment for bug to fix. And we don't take checkpoints at end of REDO phase yet so can trust block->type.
storage/maria/ma_recovery.c: Comments. Now-unneeded code for incomplete REDO-UNDO groups removed. When we forget about an old transaction we must really forget about it with bzero() (fixes the "long_trid!=0 assertion" recovery bug). When we delete a row with maria_delete() we turn on STATE_NOT_OPTIMIZED_ROWS so we do the same when we see a CLR_END for an UNDO_ROW_INSERT or when we execute an UNDO_ROW_INSERT (in both cases a row was deleted). Pick up max_long_trid from the checkpoint record.
storage/maria/maria_chk.c: comment
storage/maria/maria_def.h: MARIA_FILE_BITMAP gets new members: 'flushable', 'bitmap_cond' and 'pinned_pages'.
storage/maria/trnman.c: I used to think that recovery only needs to know the maximum TrID of the lists of active and committed transactions. But no, sometimes both lists can even be empty and their TrID should not be reused. So Checkpoint now saves global_trid_generator in the checkpoint record.
storage/maria/trnman_public.h: macros to read/store a TrID
mysql-test/r/maria-recovery-bitmap.result: result is ok. Without the code fix, we would get a corruption message about the bitmap page in CHECK TABLE EXTENDED.
mysql-test/t/maria-recovery-bitmap-master.opt: usual when we crash mysqld in tests
mysql-test/t/maria-recovery-bitmap.test: test of recovery problems specific to the bitmap pages.
2007-12-14 16:14:12 +01:00
bzero(&all_active_trans[sid], sizeof(all_active_trans[sid]));
WL#3072 - Maria recovery
Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elsewhere; recreates tables from the log, and compares and fails if there is a difference. Passes now.
Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria.
Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW.
Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE.
Code cleanups.
Monty: please look for "QQ". Sanja: please look for "Sanja".
Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase...
Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE.
sql/handler.cc: typo
storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c
storage/maria/ha_maria.cc: comments
storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_data_file() moves into the bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more. When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it).
storage/maria/ma_blockrec.c:
  * no need to sync the data file if table is not transactional
  * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store).
  * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR).
  * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update).
  * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update).
  * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now).
storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported.
storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()).
storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used.
storage/maria/ma_control_file.h: double-inclusion safe
storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c
storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero.
storage/maria/ma_loghandler.c:
  * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state").
  * LOG_DESC::record_ends_group changed to an enum.
  * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected
  * Sanja please see the @todo LOG BUG
  * avoiding DBUG_RETURN(func()) as it gives confusing debug traces.
storage/maria/ma_loghandler.h:
  - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had
  - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log.
storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled.
storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore.
storage/maria/ma_rename.c: don't write log records during recovery
storage/maria/ma_test2.c:
  - fail if maria_info() or other subtests find some wrong information
  - new option -g to skip updates.
  - init the translog before creating the table, so that log applying can work.
  - in "#if 0" you'll see some fixed bugs (will be removed).
storage/maria/ma_test_all.sh: cleanup files. Test log applying.
storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code.
storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
return 0;
}
llstr(long_trid, llbuf);
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand.
Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before).
BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches?
mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places)
mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing.
storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false).
storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come).
storage/maria/ma_blockrec.h: update to new names
storage/maria/ma_checkpoint.c: update to new names
storage/maria/ma_extra.c: update to new names
storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP
storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP
storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
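The boundary role of LOGREC_INCOMPLETE_GROUP described in the commit message above can be illustrated with a small sketch. This is not the code of ma_recovery.c: the enum, the function name count_executed_redos() and the "count pending REDOs" model are all hypothetical simplifications, assuming only what the message states (REDOs are executed once their group is closed by an UNDO or CLR_END, and the marker makes an unfinished group be skipped).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record kinds; the real log has many more types. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Scan one transaction's records in log order and return how many REDOs
  would be executed. REDOs are buffered until their group is closed by an
  UNDO or a CLR_END; an INCOMPLETE_GROUP marker discards what is buffered.
*/
static unsigned count_executed_redos(const enum rec_kind *log, size_t n)
{
  unsigned pending= 0, executed= 0;
  size_t i;
  assert(log != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    switch (log[i])
    {
    case REC_REDO:
      pending++;                 /* not executed until the group ends */
      break;
    case REC_UNDO:
    case REC_CLR_END:
      executed+= pending;        /* group is complete: execute its REDOs */
      pending= 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending= 0;                /* boundary: forget the unfinished group */
      break;
    }
  }
  return executed;
}
```

With the sequence REDO, UNDO, REDO*, REDO, CLR_END this model executes three REDOs, i.e. the stray REDO* is wrongly absorbed into the rollback's group; with the marker written right after REDO*, only two are executed.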
tprint(tracef, "Transaction long_trid %s short_trid %u committed\n",
llbuf, sid);
bzero(&all_active_trans[sid], sizeof(all_active_trans[sid]));
#ifdef MARIA_VERSIONING
/*
if real recovery:
transaction was committed, move it to some separate list for later
purging (but don't purge now! purging may have been started before, we
may find REDO_PURGE records soon).
*/
#endif
return 0;
}
WL#3072 Maria Recovery
misc fixes of execution of UNDOs in the UNDO phase:
- into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase).
- declaring all UNDOs and CLR_END as "compressed"
- when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints).
- bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away).
- modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing).
- ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug.
storage/maria/ma_blockrec.c:
  - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead.
  - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN).
  - store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase.
  - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE.
  - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state")
storage/maria/ma_loghandler.c:
  - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed".
  - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records.
  - reset share->id to 0 when deassigning; not useful for now but sounds logical.
storage/maria/ma_recovery.c:
  - if no table is found for a REDO, it's not an error; for an UNDO, it is
  - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records.
  - in the UNDO phase, when we execute an UNDO_ROW_INSERT:
    * update trn->undo_lsn only after executing the record
    * store the _previous_ undo_lsn into the CLR_END
  - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them.
storage/maria/ma_test1.c:
  * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point.
  * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery
storage/maria/ma_test_recovery:
  * Testing execution of UNDOs, with and without BLOBs.
  * Testing idempotency of REDOs.
  * See @todo for a probable bug with BLOBs.
  * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug, see comment of ma_blockrec.c).
  * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected).
  * Less output on the screen, compares with expected output in the end.
  * some shell thingies like "set --" and $# are courtesy of Danny and Pekka.
storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase
storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
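The commit note above says that CLR_END stores the type of the undone record so that the REDO phase knows if and how to update state.records. A minimal sketch of that decision follows. The names `undone_type`, `table_state` and `redo_clr_end_adjust_records` are hypothetical and are not the Maria API. The rule that the state is only touched when it is older than the record is an assumption, modelled on the `state.is_of_lsn` comparison described elsewhere in this history.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the undone record types stored in a CLR_END. */
enum undone_type { UNDONE_ROW_INSERT, UNDONE_ROW_UPDATE, UNDONE_ROW_DELETE };

/* Hypothetical, reduced table state: row count and the LSN it reflects. */
typedef struct {
  uint64_t records;
  uint64_t is_of_lsn;
} table_state;

/*
  Adjust the row count for a CLR_END seen in the REDO phase.
  Returns 1 if the row count was changed, 0 if it was left alone.
*/
static int redo_clr_end_adjust_records(table_state *state, uint64_t clr_lsn,
                                       enum undone_type undone)
{
  assert(state != NULL);
  if (clr_lsn <= state->is_of_lsn)
    return 0;                       /* assumed: state already covers this CLR_END */
  switch (undone) {
  case UNDONE_ROW_INSERT:
    state->records--;               /* the insert was rolled back */
    return 1;
  case UNDONE_ROW_DELETE:
    state->records++;               /* the delete was rolled back */
    return 1;
  case UNDONE_ROW_UPDATE:
    break;                          /* an undone update keeps the row count */
  }
  return 0;
}
```

A caller would pass the LSN of the CLR_END and the decoded type; a state that is already newer than the record is left untouched, which is what makes re-applying the same log harmless in this sketch.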
prototype_redo_exec_hook(CLR_END)
{
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we now need it also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address of the used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
LSN previous_undo_lsn;
enum translog_record_type undone_record_type;
const LOG_DESC *log_desc;
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo of key pages works storage/maria/ma_test_recovery: Updated test after redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
my_bool row_entry= 0;
Fixed bug in undo_key_delete; Caused crashed key files in recovery Maria is now used for internal temporary tables in MySQL Better usage of VARCHAR and long strings in temporary tables Use packed fields if BLOCK_RECORD is used null_bytes are not anymore stored in a separate field New interface to remember and restore scan position Fixed bugs in unique handling Don't sync Maria temporary tables Lock control file while it's used to stop several processes from using it Changed value of MA_DONT_OVERWRITE_FILE as it collided with MY_SYNC_DIR Split MY_DONT_WAIT into MY_NO_WAIT and MY_SHORT_WAIT (for my_lock()) Added MY_FORCE_LOCK include/my_sys.h: Changed value of MA_DONT_OVERWRITE_FILE as it collided with MY_SYNC_DIR Split MY_DONT_WAIT into MY_NO_WAIT and MY_SHORT_WAIT (for my_lock()) Added MY_FORCE_LOCK include/myisam.h: Make MyISAM columndef compile time compatible with Maria mysql-test/lib/mtr_process.pl: Removed confusing warning (It's common that there is a lot of other files than pid files) mysql-test/mysql-test-run.pl: Added --sync-frm to speed up tests mysql-test/r/maria-recovery.result: Updated results from wrong push mysql-test/suite/rpl/t/rpl_innodb_bug28430.test: Marked test as --big mysys/my_lock.c: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) mysys/my_thr_init.c: Fix that we don't give name to thread before it's properly initied sql/handler.cc: Added myisam.h sql/handler.h: Changes to use Maria for internal temporary tables Removed not needed argument to restart_rnd_next() Added function remember_rnd_pos() sql/my_lock.c: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) 
sql/mysql_priv.h: Added maria_hton sql/sql_class.h: Changes to use Maria for internal temporary tables sql/sql_select.cc: Changes to use Maria for internal temporary tables Temporary tables didn't properly switch to dynamic row format if long strings was used Better usage of VARCHAR in temporary tables Use new interface to restart scan in duplicate removal sql/sql_select.h: Changes to use Maria for internal temporary tables sql/sql_show.cc: Changes to use Maria for internal temporary tables Removed all end space sql/sql_table.cc: Set HA_OPTION_PACK_RECORD if we are not using default or static record sql/sql_union.cc: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) sql/sql_update.cc: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) storage/maria/ha_maria.cc: Use packed fields null_bytes are not anymore stored in a separate field Changes to use Maria for internal temporary tables Give warning if we try to do an ALTER TABLE to a unusable row format storage/maria/ha_maria.h: Allow Maria with block format to restart scanning at given position storage/maria/ma_blockrec.c: Added functions to remember and restore scan position Allocate cur_row.extents so that we don't have to do a malloc on first read Fixed bug when using packed row without packed strings Removed unneeded calls to free_full_pages() Fixed unlikely bug when using old bitmap to read head page and head page had gone away Remember row position when doing undo of delete and update row (needed for undo of key delete) storage/maria/ma_blockrec.h: Added functions to remember and restore scan position storage/maria/ma_close.c: Don't sync 
temporary tables storage/maria/ma_control_file.c: Lock control file while it's used to stop several processes from using it storage/maria/ma_create.c: Fixed bug when using FIELD_NORMAL that was longer than FULL_PAGE_SIZE Fixed bug that caused fields to not be ordered according to offset Fixed bug in unique creation storage/maria/ma_delete.c: Don't write record reference when deleting key. (Rowid is likely to be different when we undo this) storage/maria/ma_dynrec.c: Fixed core dump when comparing records (happened in unique handling) storage/maria/ma_extra.c: MY_DONT_WAIT -> MY_SHORT_WAIT Removed TODO comment. (Was not relevant as all other instances are guaranteed to be closed when the code is executed) Added DBUG_ASSERT() to prove above. storage/maria/ma_key_recover.c: CLR's for UNDO_ROW_DELETE and UNDO_ROW_UPDATE now include rowid for the row. This was needed for undo_key_delete to work, as undo of delete row is likely to put row in a new position. undo_delete_key now doesn't include row position storage/maria/ma_open.c: Added virtual functions for remembering and restoring scan position Fixed wrong key search method when using multi-byte character sets (Bug#32705) Store original column number in index file NOTE: Index files are now incompatible with previous versions! 
(Ok as we haven't yet made a public Maria release) storage/maria/ma_recovery.c: Set info->cur_row.lastpos when reading CLR's for UNDO_ROW_DELETE or UNDO_ROW_UPDATE storage/maria/ma_scan.c: Added default function to remember and restore scan position storage/maria/maria_def.h: Added virtual functions & variables to remember and restore scan position Added MARIA_MAX_CONTROL_FILE_LOCK_RETRY storage/myisam/ha_myisam.cc: Fixed compiler errors as columdef->type is now an enum, not an integer Added functions to remember and restore scan position storage/myisam/ha_myisam.h: Added functions to remember and restore scan position storage/myisam/mi_check.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/mi_extra.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/mi_open.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/myisamdef.h: MY_DONT_WAIT -> MY_SHORT_WAIT
2007-12-17 00:17:37 +01:00
uchar *logpos;
Fixed repair_by_sort to work with BLOCK_RECORD Fixed bugs in undo logging Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) Reserved place for reference-transid on key pages (for packing of transids) ALTER TABLE and INSERT ... SELECT now use fast creation of index Known bugs: ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix) ma_test_recovery.expected is not totally correct, because of the above bug mysqld sometimes fails to restart; Fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate include/maria.h: Prototype changes Added current_filepos to st_maria_sort_info mysql-test/r/maria.result: Updated results that change as ALTER TABLE and INSERT ... SELECT now use fast creation of index mysys/mf_iocache.c: Reset variable to guard against double invocation storage/maria/ma_bitmap.c: Added _ma_bitmap_reset_cache() (needed for repair) storage/maria/ma_blockrec.c: Simplify code More initial allocations Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) storage/maria/ma_blockrec.h: Moved TRANSID_SIZE to maria_def.h Added prototype for new functions storage/maria/ma_check.c: Simplify code Fixed repair_by_sort to work with BLOCK_RECORD - When using BLOCK_RECORD or UNPACK create new Maria handle - Use common initializer function - Align code with maria_repair() Made some changes to maria_repair_parallel() to use common initializer function Removed ASK_MONTY section by fixing noted problem storage/maria/ma_close.c: Moved check for readonly to _ma_state_info_write() storage/maria/ma_key_recover.c: Use different log entries if key root changes or not. 
This fixed some bugs when tree grows storage/maria/ma_key_recover.h: Added keynr to st_msg_to_write_hook_for_undo_key storage/maria/ma_loghandler.c: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_loghandler.h: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_open.c: Added TRANSID to all key pages (for future compressing of trans id's) For compressed records, alloc a bit bigger buffer to avoid valgrind warnings If table is opened readonly, don't update state storage/maria/ma_packrec.c: Allocate bigger array for bit unpacking to avoid valgrind errors storage/maria/ma_recovery.c: Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_sort.c: More logging storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Update results Note that this is not complete because of a bug in recovery storage/maria/ma_test_recovery: Removed recreation of index (not needed when we have redo for index pages) storage/maria/maria_chk.c: When using flag --read-only, don't update status for files When using --unpack, don't use REPAIR_BY_SORT if other repair option is given Enable repair_by_sort for BLOCK records Removed not needed newline at start of --describe storage/maria/maria_def.h: Support for TRANSID_SIZE to key pages storage/maria/maria_read_log.c: renamed --only-display to --display-only
2007-11-28 20:38:30 +01:00
DBUG_ENTER("exec_REDO_LOGREC_CLR_END");
previous_undo_lsn= lsn_korr(rec->header);
undone_record_type=
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expects the new row's checksum in cur_row.checksum (was already true) and the old row's checksum in new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of the record's checksum. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c.
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is thus unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is thus unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds).
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
clr_type_korr(rec->header + LSN_STORE_SIZE + FILEID_STORE_SIZE);
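The commit message above describes storing a 4-byte checksum delta in UNDO records and applying it in the REDO phase only when the table state is older than the record. A minimal sketch of that idea follows; all `sketch_` names are hypothetical, the low-byte-first layout is an assumption, and the real code uses its own store/korr macros and state structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of the table state discussed above. */
typedef struct {
  uint64_t is_of_lsn;  /* LSN up to which this state is known to be current */
  uint32_t checksum;   /* the table's live checksum */
} sketch_state;

/* Delta caused by an operation; unsigned arithmetic wraps modulo 2^32. */
static uint32_t sketch_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/* Store the delta in 4 bytes, low byte first (layout is an assumption). */
static void sketch_delta_store(unsigned char *to, uint32_t delta)
{
  to[0]= (unsigned char) (delta & 0xFF);
  to[1]= (unsigned char) ((delta >> 8) & 0xFF);
  to[2]= (unsigned char) ((delta >> 16) & 0xFF);
  to[3]= (unsigned char) ((delta >> 24) & 0xFF);
}

/* Read back a delta written by sketch_delta_store(). */
static uint32_t sketch_delta_korr(const unsigned char *from)
{
  return ((uint32_t) from[0]) |
         (((uint32_t) from[1]) << 8) |
         (((uint32_t) from[2]) << 16) |
         (((uint32_t) from[3]) << 24);
}

/*
  REDO-phase sketch: apply the stored delta only if the state is older
  than the record. Returns 1 if the checksum was updated, 0 if skipped.
  This sketch does not advance is_of_lsn; how and when the real code
  does that is outside what the commit message states.
*/
static int sketch_redo_apply_delta(sketch_state *state, uint64_t rec_lsn,
                                   const unsigned char *stored)
{
  assert(state != NULL && stored != NULL);
  if (state->is_of_lsn >= rec_lsn)
    return 0;
  state->checksum+= sketch_delta_korr(stored);
  return 1;
}
```

With the example from the commit message, a checksum going from 123 to 456 yields a stored delta of 333, and a state older than the record ends up at 456 again after the REDO-phase call.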
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
log_desc= &log_record_type_descriptor[undone_record_type];
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
set_undo_lsn_for_active_trans(rec->short_trid, previous_undo_lsn);
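The commit message above says a CLR_END carries the LSN of the previous UNDO and the type of the undone record, so that the REDO phase can update trn->undo_lsn and, where appropriate, state.records. The following is a rough sketch of that bookkeeping, not the actual hook. The type and function names are hypothetical, and only the `records--` for an undone insert is stated in the commit message; the `records++` for an undone delete is an inference made for this sketch.

```c
#include <stdint.h>

/* Hypothetical record types that a CLR_END may refer to. */
typedef enum {
  SKETCH_UNDO_ROW_INSERT,
  SKETCH_UNDO_ROW_DELETE,
  SKETCH_UNDO_ROW_UPDATE
} sketch_undone_type;

/* Hypothetical stand-ins for the transaction and the table state. */
typedef struct { uint64_t undo_lsn; } sketch_trn;
typedef struct { uint64_t records; } sketch_table_state;

/*
  What seeing (or writing) a CLR_END implies in this sketch:
  - the transaction's undo_lsn moves back to the previous UNDO; this
    value may legitimately be 0 when the undone record was the
    transaction's first UNDO, which is why 0 cannot serve as a
    "not UNDO work" marker;
  - the row count is adjusted according to what was undone.
*/
static void sketch_on_clr_end(sketch_trn *trn, sketch_table_state *state,
                              uint64_t previous_undo_lsn,
                              sketch_undone_type undone)
{
  trn->undo_lsn= previous_undo_lsn;
  switch (undone)
  {
  case SKETCH_UNDO_ROW_INSERT:
    state->records--;            /* the inserted row is gone again */
    break;
  case SKETCH_UNDO_ROW_DELETE:
    state->records++;            /* assumed: the deleted row is back */
    break;
  case SKETCH_UNDO_ROW_UPDATE:
    break;                       /* row count unchanged */
  }
}
```

Storing the undone record's type in the CLR_END is what lets this step run without re-reading the original UNDO record.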
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
if (info == NULL)
DBUG_RETURN(0);
share= info->s;
tprint(tracef, " CLR_END was about %s, undo_lsn now LSN (%lu,0x%lx)\n",
log_desc->name, LSN_IN_PARTS(previous_undo_lsn));
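The commit message above notes that the hash key in all_dirty_pages is now built from index_or_data, the table's short id and the page number, instead of a file descriptor and page number. One way such a key could be packed into a single 64-bit integer is sketched below. The bit layout, the 40-bit page number width and the `sketch_` names are assumptions for illustration; the real key construction may differ.

```c
#include <stdint.h>

/* Assumption: page numbers fit in 5 bytes (40 bits). */
#define SKETCH_PAGE_BITS 40

/*
  Pack (index_or_data, short table id, page number) into one key:
    bit 56      : 1 for an index file page, 0 for a data file page
    bits 40..55 : the table's 2-byte short id
    bits 0..39  : the page number
*/
static uint64_t sketch_dirty_page_key(int is_index, uint16_t short_id,
                                      uint64_t pageno)
{
  uint64_t page_mask= (((uint64_t) 1) << SKETCH_PAGE_BITS) - 1;
  return (((uint64_t) (is_index ? 1 : 0)) << (SKETCH_PAGE_BITS + 16)) |
         (((uint64_t) short_id) << SKETCH_PAGE_BITS) |
         (pageno & page_mask);
}
```

Because the key no longer depends on file descriptors, it stays meaningful after a crash, when descriptors from the original run no longer exist.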
Fixed bug in undo_key_delete; Caused crashed key files in recovery Maria is now used for internal temporary tables in MySQL Better usage of VARCHAR and long strings in temporary tables Use packed fields if BLOCK_RECORD is used null_bytes are not anymore stored in a separate field New interface to remember and restore scan position Fixed bugs in unique handling Don't sync Maria temporary tables Lock control file while it's used to stop several processes from using it Changed value of MA_DONT_OVERWRITE_FILE as it collided with MY_SYNC_DIR Split MY_DONT_WAIT into MY_NO_WAIT and MY_SHORT_WAIT (for my_lock()) Added MY_FORCE_LOCK include/my_sys.h: Changed value of MA_DONT_OVERWRITE_FILE as it collided with MY_SYNC_DIR Split MY_DONT_WAIT into MY_NO_WAIT and MY_SHORT_WAIT (for my_lock()) Added MY_FORCE_LOCK include/myisam.h: Make MyISAM columndef compile time compatible with Maria mysql-test/lib/mtr_process.pl: Removed confusing warning (It's common that there is a lot of other files than pid files) mysql-test/mysql-test-run.pl: Added --sync-frm to speed up tests mysql-test/r/maria-recovery.result: Updated results from wrong push mysql-test/suite/rpl/t/rpl_innodb_bug28430.test: Marked test as --big mysys/my_lock.c: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) mysys/my_thr_init.c: Fix that we don't give name to thread before it's properly initied sql/handler.cc: Added myisam.h sql/handler.h: Changes to use Maria for internal temporary tables Removed not needed argument to restart_rnd_next() Added function remember_rnd_pos() sql/my_lock.c: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) 
sql/mysql_priv.h: Added maria_hton sql/sql_class.h: Changes to use Maria for internal temporary tables sql/sql_select.cc: Changes to use Maria for internal temporary tables Temporary tables didn't properly switch to dynamic row format if long strings was used Better usage of VARCHAR in temporary tables Use new interface to restart scan in duplicate removal sql/sql_select.h: Changes to use Maria for internal temporary tables sql/sql_show.cc: Changes to use Maria for internal temporary tables Removed all end space sql/sql_table.cc: Set HA_OPTION_PACK_RECORD if we are not using default or static record sql/sql_union.cc: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) sql/sql_update.cc: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) storage/maria/ha_maria.cc: Use packed fields null_bytes are not anymore stored in a separate field Changes to use Maria for internal temporary tables Give warning if we try to do an ALTER TABLE to a unusable row format storage/maria/ha_maria.h: Allow Maria with block format to restart scanning at given position storage/maria/ma_blockrec.c: Added functions to remember and restore scan position Allocate cur_row.extents so that we don't have to do a malloc on first read Fixed bug when using packed row without packed strings Removed unneeded calls to free_full_pages() Fixed unlikely bug when using old bitmap to read head page and head page had gone away Remember row position when doing undo of delete and update row (needed for undo of key delete) storage/maria/ma_blockrec.h: Added functions to remember and restore scan position storage/maria/ma_close.c: Don't sync 
temporary tables storage/maria/ma_control_file.c: Lock control file while it's used to stop several processes from using it storage/maria/ma_create.c: Fixed bug when using FIELD_NORMAL that was longer than FULL_PAGE_SIZE Fixed bug that caused fields to not be ordered according to offset Fixed bug in unique creation storage/maria/ma_delete.c: Don't write record reference when deleting key. (Rowid is likely to be different when we undo this) storage/maria/ma_dynrec.c: Fixed core dump when comparing records (happened in unique handling) storage/maria/ma_extra.c: MY_DONT_WAIT -> MY_SHORT_WAIT Removed TODO comment. (Was not relevant as all other instances are guaranteed to be closed when the code is executed) Added DBUG_ASSERT() to prove above. storage/maria/ma_key_recover.c: CLR's for UNDO_ROW_DELETE and UNDO_ROW_UPDATE now include rowid for the row. This was needed for undo_key_delete to work, as undo of delete row is likely to put row in a new position. undo_delete_key now doesn't include row position storage/maria/ma_open.c: Added virtual functions for remembering and restoring scan position Fixed wrong key search method when using multi-byte character sets (Bug#32705) Store original column number in index file NOTE: Index files are now incompatible with previous versions!
(Ok as we haven't yet made a public Maria release) storage/maria/ma_recovery.c: Set info->cur_row.lastpos when reading CLR's for UNDO_ROW_DELETE or UNDO_ROW_UPDATE storage/maria/ma_scan.c: Added default function to remember and restore scan position storage/maria/maria_def.h: Added virtual functions & variables to remember and restore scan position Added MARIA_MAX_CONTROL_FILE_LOCK_RETRY storage/myisam/ha_myisam.cc: Fixed compiler errors as columdef->type is now an enum, not an integer Added functions to remember and restore scan position storage/myisam/ha_myisam.h: Added functions to remember and restore scan position storage/myisam/mi_check.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/mi_extra.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/mi_open.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/myisamdef.h: MY_DONT_WAIT -> MY_SHORT_WAIT
2007-12-17 00:17:37 +01:00
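The locking rules this commit message gives for my_lock() can be read as a small decision table. The following is a minimal sketch only: the flag values and the plan_lock() helper are hypothetical and are not the real mysys API, and the blocking default when no wait flag is given is an assumption rather than something the message states.

```c
#include <assert.h>

/* Hypothetical flag values for illustration; the real ones are in my_sys.h. */
#define SKETCH_FORCE_LOCK  1
#define SKETCH_NO_WAIT     2
#define SKETCH_SHORT_WAIT  4

/* What a lock call should do, according to the rules in the commit message. */
enum lock_plan
{
  PLAN_SKIP,           /* locking is disabled and not forced: do nothing   */
  PLAN_TRY_ONCE,       /* NO_WAIT: return at once if the lock is occupied  */
  PLAN_RETRY_BRIEFLY,  /* SHORT_WAIT: wait some time, then give up         */
  PLAN_BLOCK           /* assumed default: wait until the lock is granted  */
};

static enum lock_plan plan_lock(int flags, int locking_disabled)
{
  /* FORCE_LOCK overrides a global "locking disabled" setting. */
  if (locking_disabled && !(flags & SKETCH_FORCE_LOCK))
    return PLAN_SKIP;
  if (flags & SKETCH_NO_WAIT)
    return PLAN_TRY_ONCE;
  if (flags & SKETCH_SHORT_WAIT)
    return PLAN_RETRY_BRIEFLY;
  return PLAN_BLOCK;
}
```

For instance, plan_lock(SKETCH_FORCE_LOCK | SKETCH_NO_WAIT, 1) yields PLAN_TRY_ONCE: the lock is attempted although locking is disabled, and the call does not wait.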
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Bugs fixed:
- If not in autocommit mode, delete rows one by one so that we can roll back if necessary
- bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten
- Fixed bug in bitmap handling when allocating tail pages
- Ensure we reserve place for directory entry when calculating place for head and tail pages
- Fixed wrong value in bitmap->size[0]
- Fixed wrong assert in flush_log_for_bitmap
- Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset
- Mark new pages as changed (Required to get repair() to work)
- Fixed problem with advancing log horizon pointer within one page bounds
- Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert()
- Fixed bug in logging of rows with more than one big blob
- Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls
- Flush pagecache when we change from logging to not logging (if not, pagecache code breaks)
- Ensure my_errno is set on return from write/delete/update
- Fixed bug when using FIELD_SKIP_PRESPACE
New features:
- mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution
- Fix that optimize works for Maria tables
- maria_check --zerofill now also clears freed blob pages
- maria_check -di now prints more information about record page utilization
Optimizations:
- Use pagecache_unlock_by_link() instead of pagecache_write() if possible. (Avoids a memory copy and a find_block)
- Simplify code to abort when we found optimal bit pattern
- Skip also full head page bit patterns when searching for tail
- Increase default repair buffer to 128M for maria_chk and maria_read_log
- Increase default sort buffer for maria_chk to 64M
- Increase size of sortbuffer and pagecache for mysqld to 64M
- VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables
Better reporting:
- Fixed test of error condition for flush (for better error code)
- More error messages to mysqld if Maria recovery fails
- Always print warning if rows are deleted in repair
- Added global function _db_force_flush() that is usable when doing debugging in gdb
- Added call to my_debug_put_break_here() in case of some errors (for debugging)
- Remove used testfiles in unittest as these were written in different directories depending on from where the test was started
This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria
dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdb
extra/replace.c: Fixed memory leak
include/my_dbug.h: Prototype for _db_force_flush()
include/my_global.h: Added stdarg.h as my_sys.h now depends on it.
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
    eprint(tracef, "Failed to read record");
    return 1;
  }
  logpos= (log_record_buffer.str + LSN_STORE_SIZE + FILEID_STORE_SIZE +
           CLR_TYPE_STORE_SIZE);
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful that 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful that 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. Is it printed n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
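The live-checksum recovery described in this commit message comes down to two steps: store the checksum variation in the UNDO record, then, in the REDO phase, add it back only when the table state is older than the record. The sketch below illustrates that under simplifying assumptions: struct table_state, lsn_t and both helper functions are hypothetical stand-ins, not the real Maria structures, and LSNs are modelled as plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real Maria types. */
typedef uint64_t lsn_t;

struct table_state
{
  uint32_t checksum;       /* live checksum of the table                   */
  lsn_t    is_of_horizon;  /* state already reflects records before this   */
};

/*
  Delta as it would be stored in 4 bytes in the UNDO record.
  Unsigned arithmetic wraps modulo 2^32, so a "negative" variation
  is still restored correctly by a later addition.
*/
static uint32_t checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/*
  REDO-phase step: apply the delta only if the state is older than the
  record. Returns 1 if the delta was applied, 0 if it was skipped.
*/
static int redo_apply_checksum(struct table_state *state, lsn_t record_lsn,
                               uint32_t delta)
{
  if (record_lsn < state->is_of_horizon)
    return 0;                       /* state already includes this record */
  state->checksum+= delta;
  return 1;
}
```

With the numbers from the message, a change from 123 to 456 gives checksum_delta(123, 456) == 333, and applying 333 to a state whose checksum is 123 and which is older than the record brings the checksum back to 456.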
if (cmp_translog_addr(rec->lsn, share->state.is_of_horizon) >= 0)
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
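The reason for storing the undone record's type in CLR_END is that the REDO phase needs it to adjust state.records. A minimal sketch of that adjustment follows; the enum and the records_after_clr_end() helper are hypothetical names for illustration, and the "+1 for an undone DELETE" case is inferred from the message (which only spells out the records-- for an undone INSERT).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record types, named after the log records in the text. */
enum undone_type
{
  UNDONE_ROW_INSERT,
  UNDONE_ROW_DELETE,
  UNDONE_ROW_UPDATE
};

/*
  Row count after a CLR_END has been seen in the REDO phase.
  The CLR_END says "this operation was rolled back", so the count
  moves in the direction opposite to the original operation.
*/
static uint64_t records_after_clr_end(uint64_t records, enum undone_type type)
{
  switch (type) {
  case UNDONE_ROW_INSERT:
    return records - 1;   /* the inserted row was removed again */
  case UNDONE_ROW_DELETE:
    return records + 1;   /* the deleted row was re-inserted    */
  case UNDONE_ROW_UPDATE:
    break;                /* an update does not change the count */
  }
  return records;
}
```

In the real code this adjustment is only made when the table state is older than the record, which is the comparison performed just before the switch on undone_record_type.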
  {
    tprint(tracef, " state older than record\n");
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
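The commit message above says that CLR_END stores the type of the record that was undone, so that the REDO phase knows if and how to update state.records. A minimal sketch of that idea follows. It is not the actual ma_recovery.c code: every `sketch_`/`SKETCH_` name is a hypothetical stand-in for the real LOGREC_UNDO_ROW_* types and MARIA_SHARE state, and it leaves out other state the real code maintains (flags, checksum, LSN checks).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real LOGREC_UNDO_ROW_* record types. */
enum sketch_undone_type {
  SKETCH_UNDO_ROW_INSERT,
  SKETCH_UNDO_ROW_DELETE,
  SKETCH_UNDO_ROW_UPDATE
};

/* Hypothetical, reduced table state: only the row count. */
struct sketch_state {
  uint64_t records;
};

/*
  Apply to the row count the effect of a CLR_END seen during the REDO phase.
  Undoing a DELETE re-inserted a row, undoing an INSERT removed one, and
  undoing an UPDATE left the number of rows unchanged.
*/
static void sketch_apply_clr_end(struct sketch_state *state,
                                 enum sketch_undone_type undone)
{
  switch (undone) {
  case SKETCH_UNDO_ROW_DELETE:
    state->records++;
    break;
  case SKETCH_UNDO_ROW_INSERT:
    assert(state->records > 0);   /* a row must exist to be removed */
    state->records--;
    break;
  case SKETCH_UNDO_ROW_UPDATE:
    break;                        /* row count unchanged */
  }
}
```

Storing the undone type in the CLR_END itself means the REDO phase does not have to look up the original UNDO record to decide which way the row count moved.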
switch (undone_record_type) {
case LOGREC_UNDO_ROW_DELETE:
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with uninitialized variable storage/maria/ma_test2.c: Removed unused code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result now that redo/undo of key pages works storage/maria/ma_test_recovery: Updated test now that redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
row_entry= 1;
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is thus unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is thus unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds).
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
share->state.state.records++;
2007-09-06 16:04:36 +02:00
break;
case LOGREC_UNDO_ROW_INSERT:
2007-10-02 18:02:09 +02:00
share->state.state.records--;
WL#3072 - Maria recovery. * fix for bitmap vs checkpoint bug which could lead to corrupted tables in case of crashes at certain moments: a bitmap could be flushed to disk even though it was inconsistent with the log (it could be flushed before REDO-UNDO are written to the log). One bug remains, need code from others. Tests added. Fix is to pin unflushable bitmap pages, and let checkpoint wait for them to be flushable. * fix for long_trid!=0 assertion failure at Recovery. * less useless wakeups in the background flush|checkpoint thread. * store global_trid_generator in checkpoint record. mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery.test: make it easier to locate subtests storage/maria/ma_bitmap.c: When we send a bitmap to the pagecache, if this bitmap is not in a flushable state we keep it pinned and add it to a list, it will be unpinned when the bitmap is flushable again. A new function _ma_bitmap_flush_all() used by checkpoint. A new function _ma_bitmap_flushable() used by block format to signal when it starts modifying a bitmap and when it is done with it. storage/maria/ma_blockrec.c: When starting a row operation (insert/update/delete), mark that the bitmap is not flushable (because for example INSERT is going to over-allocate in the bitmap to prevent other threads from using our data pages). If a checkpoint comes at this moment it will wait for the bitmap to be flushable before flushing it. When the operation ends, bitmap becomes flushable again; that transition is done under the bitmap's mutex (needed for correct synchro with a concurrent checkpoint); but for INSERT/UPDATE this happens inside _ma_bitmap_release_unused() at a place where it already has the mutex, so the only penalty (mutex adding) is in DELETE and UNDO of INSERT. In case of errors after setting the bitmap unflushable, we must always set it back to flushable or checkpoint would block. Debug possibilities to force a sleep while the bitmap is over-allocated. 
In case of error in get_head_or_tail() in allocate_and_write_block_record(), we still need to unpin all pages. Bugfix: _ma_apply_redo_insert_row_blobs() produced wrong data_file_length. storage/maria/ma_blockrec.h: new bitmap calls. storage/maria/ma_checkpoint.c: filter_flush_indirect not needed anymore (flushing bitmap pages happens in _ma_bitmap_flush_all() now). So st_filter_param::is_data_file|pages_covered_by_bitmap not needed. Other filter_flush* don't need to flush bitmap anymore. Add debug possibility to flush all bitmap pages outside of a checkpoint, to simulate pagecache LRU eviction. When the background flush/checkpoint thread notices it has nothing to flush, it now sleeps directly until the next potential checkpoint moment instead of waking up every second. When in checkpoint we decide to not store a table in the checkpoint record (because it has logged no writes for example), we can also skip flushing this table. storage/maria/ma_commit.c: comment is out-of-date storage/maria/ma_key_recover.c: comment fix storage/maria/ma_loghandler.c: comment is out-of-date storage/maria/ma_open.c: comment is out-of-date storage/maria/ma_pagecache.c: comment for bug to fix. And we don't take checkpoints at end of REDO phase yet so can trust block->type. storage/maria/ma_recovery.c: Comments. Now-unneeded code for incomplete REDO-UNDO groups removed. When we forget about an old transaction we must really forget about it with bzero() (fixes the "long_trid!=0 assertion" recovery bug). When we delete a row with maria_delete() we turn on STATE_NOT_OPTIMIZED_ROWS so we do the same when we see a CLR_END for an UNDO_ROW_INSERT or when we execute an UNDO_ROW_INSERT (in both cases a row was deleted). Pick up max_long_trid from the checkpoint record. storage/maria/maria_chk.c: comment storage/maria/maria_def.h: MARIA_FILE_BITMAP gets new members: 'flushable', 'bitmap_cond' and 'pinned_pages'. 
storage/maria/trnman.c: I used to think that recovery only needs to know the maximum TrID of the lists of active and committed transactions. But no, sometimes both lists can even be empty and their TrID should not be reused. So Checkpoint now saves global_trid_generator in the checkpoint record. storage/maria/trnman_public.h: macros to read/store a TrID mysql-test/r/maria-recovery-bitmap.result: result is ok. Without the code fix, we would get a corruption message about the bitmap page in CHECK TABLE EXTENDED. mysql-test/t/maria-recovery-bitmap-master.opt: usual when we crash mysqld in tests mysql-test/t/maria-recovery-bitmap.test: test of recovery problems specific of the bitmap pages.
2007-12-14 16:14:12 +01:00
share->state.changed|= STATE_NOT_OPTIMIZED_ROWS;
Fixes for redo/undo logging of key pages New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: 
pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with uninitialized variable storage/maria/ma_test2.c: Removed unused code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo of key pages works storage/maria/ma_test_recovery: Updated test after redo/undo of key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning
2007-11-20 16:42:16 +01:00
row_entry= 1;
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
break;
case LOGREC_UNDO_ROW_UPDATE:
row_entry= 1;
break;
case LOGREC_UNDO_KEY_INSERT:
case LOGREC_UNDO_KEY_DELETE:
break;
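The WL#3072 commit note above says that a CLR_END carries the LSN of the previous UNDO and the type of record that was undone, so that the REDO phase can update trn->undo_lsn and, when appropriate, state.records. A minimal sketch of that bookkeeping follows. All names here (toy_trn, toy_state, toy_apply_clr_end, the UNDONE_* constants) are hypothetical stand-ins for illustration and are not the structures or functions used in ma_recovery.c; the direction of the record-count adjustment is an assumption drawn from the note that undoing an insert does records--.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real record types live in ma_loghandler.h. */
enum toy_undone_type { UNDONE_ROW_INSERT, UNDONE_ROW_DELETE, UNDONE_ROW_UPDATE };

struct toy_trn   { uint64_t undo_lsn; };   /* head of the UNDO chain      */
struct toy_state { uint64_t records; };    /* row count kept in the state */

/*
  Apply the bookkeeping of one CLR_END seen during the REDO phase.
  previous_undo_lsn may be 0 when the undone record was the first
  UNDO of the transaction.
*/
static void toy_apply_clr_end(struct toy_trn *trn, struct toy_state *state,
                              uint64_t previous_undo_lsn,
                              enum toy_undone_type undone)
{
  /* The transaction now resumes rollback from the previous UNDO. */
  trn->undo_lsn= previous_undo_lsn;

  switch (undone) {
  case UNDONE_ROW_INSERT:              /* insert rolled back: one row less */
    assert(state->records > 0);
    state->records--;
    break;
  case UNDONE_ROW_DELETE:              /* delete rolled back: row is back  */
    state->records++;
    break;
  case UNDONE_ROW_UPDATE:              /* row count unchanged              */
    break;
  }
}
```

Because previous_undo_lsn can legitimately be 0, a value of 0 cannot double as a "not UNDO work" marker, which matches the note's switch to LSN_ERROR for that purpose.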
Fixed repair_by_sort to work with BLOCK_RECORD Fixed bugs in undo logging Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) Reserved place for reference-transid on key pages (for packing of transids) ALTER TABLE and INSERT ... SELECT now uses fast creation of index Known bugs: ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix) ma_test_recovery.expected is not totally correct, because of the above bug mysqld sometimes fails to restart; Fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate include/maria.h: Prototype changes Added current_filepos to st_maria_sort_info mysql-test/r/maria.result: Updated results that change as alter table and insert ... select now uses fast creation of index mysys/mf_iocache.c: Reset variable to guard against double invocation storage/maria/ma_bitmap.c: Added _ma_bitmap_reset_cache() (needed for repair) storage/maria/ma_blockrec.c: Simplify code More initial allocations Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) storage/maria/ma_blockrec.h: Moved TRANSID_SIZE to maria_def.h Added prototype for new functions storage/maria/ma_check.c: Simplify code Fixed repair_by_sort to work with BLOCK_RECORD - When using BLOCK_RECORD or UNPACK create new Maria handle - Use common initializer function - Align code with maria_repair() Made some changes to maria_repair_parallel() to use common initializer function Removed ASK_MONTY section by fixing noted problem storage/maria/ma_close.c: Moved check for readonly to _ma_state_info_write() storage/maria/ma_key_recover.c: Use different log entries if key root changes or not. 
This fixed some bugs when tree grows storage/maria/ma_key_recover.h: Added keynr to st_msg_to_write_hook_for_undo_key storage/maria/ma_loghandler.c: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_loghandler.h: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_open.c: Added TRANSID to all key pages (for future compressing of trans id's) For compressed records, alloc a bit bigger buffer to avoid valgrind warnings If table is opened readonly, don't update state storage/maria/ma_packrec.c: Allocate bigger array for bit unpacking to avoid valgrind errors storage/maria/ma_recovery.c: Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_sort.c: More logging storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Update results Note that this is not complete because of a bug in recovery storage/maria/ma_test_recovery: Removed recreation of index (not needed when we have redo for index pages) storage/maria/maria_chk.c: When using flag --read-only, don't update status for files When using --unpack, don't use REPAIR_BY_SORT if other repair option is given Enable repair_by_sort for BLOCK records Removed not needed newline at start of --describe storage/maria/maria_def.h: Support for TRANSID_SIZE to key pages storage/maria/maria_read_log.c: renamed --only-display to --display-only
2007-11-28 20:38:30 +01:00
case LOGREC_UNDO_KEY_INSERT_WITH_ROOT:
case LOGREC_UNDO_KEY_DELETE_WITH_ROOT:
{
uint key_nr;
my_off_t page;
Fixed bug in undo_key_delete; Caused crashed key files in recovery Maria is now used for internal temporary tables in MySQL Better usage of VARCHAR and long strings in temporary tables Use packed fields if BLOCK_RECORD is used null_bytes are not anymore stored in a separate field New interface to remember and restore scan position Fixed bugs in unique handling Don't sync Maria temporary tables Lock control file while it's used to stop several processes from using it Changed value of MA_DONT_OVERWRITE_FILE as it collided with MY_SYNC_DIR Split MY_DONT_WAIT into MY_NO_WAIT and MY_SHORT_WAIT (for my_lock()) Added MY_FORCE_LOCK include/my_sys.h: Changed value of MA_DONT_OVERWRITE_FILE as it collided with MY_SYNC_DIR Split MY_DONT_WAIT into MY_NO_WAIT and MY_SHORT_WAIT (for my_lock()) Added MY_FORCE_LOCK include/myisam.h: Make MyISAM columndef compile time compatible with Maria mysql-test/lib/mtr_process.pl: Removed confusing warning (It's common that there is a lot of other files than pid files) mysql-test/mysql-test-run.pl: Added --sync-frm to speed up tests mysql-test/r/maria-recovery.result: Updated results from wrong push mysql-test/suite/rpl/t/rpl_innodb_bug28430.test: Marked test as --big mysys/my_lock.c: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) mysys/my_thr_init.c: Fix that we don't give name to thread before it's properly initied sql/handler.cc: Added myisam.h sql/handler.h: Changes to use Maria for internal temporary tables Removed not needed argument to restart_rnd_next() Added function remember_rnd_pos() sql/my_lock.c: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) 
sql/mysql_priv.h: Added maria_hton sql/sql_class.h: Changes to use Maria for internal temporary tables sql/sql_select.cc: Changes to use Maria for internal temporary tables Temporary tables didn't properly switch to dynamic row format if long strings was used Better usage of VARCHAR in temporary tables Use new interface to restart scan in duplicate removal sql/sql_select.h: Changes to use Maria for internal temporary tables sql/sql_show.cc: Changes to use Maria for internal temporary tables Removed all end space sql/sql_table.cc: Set HA_OPTION_PACK_RECORD if we are not using default or static record sql/sql_union.cc: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) sql/sql_update.cc: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) storage/maria/ha_maria.cc: Use packed fields null_bytes are not anymore stored in a separate field Changes to use Maria for internal temporary tables Give warning if we try to do an ALTER TABLE to a unusable row format storage/maria/ha_maria.h: Allow Maria with block format to restart scanning at given position storage/maria/ma_blockrec.c: Added functions to remember and restore scan position Allocate cur_row.extents so that we don't have to do a malloc on first read Fixed bug when using packed row without packed strings Removed unneeded calls to free_full_pages() Fixed unlikely bug when using old bitmap to read head page and head page had gone away Remember row position when doing undo of delete and update row (needed for undo of key delete) storage/maria/ma_blockrec.h: Added functions to remember and restore scan position storage/maria/ma_close.c: Don't sync 
temporary tables storage/maria/ma_control_file.c: Lock control file while it's used to stop several processes from using it storage/maria/ma_create.c: Fixed bug when using FIELD_NORMAL that was longer than FULL_PAGE_SIZE Fixed bug that caused fields to not be ordered according to offset Fixed bug in unique creation storage/maria/ma_delete.c: Don't write record reference when deleting key. (Rowid is likely to be different when we undo this) storage/maria/ma_dynrec.c: Fixed core dump when comparing records (happened in unique handling) storage/maria/ma_extra.c: MY_DONT_WAIT -> MY_SHORT_WAIT Removed TODO comment. (Was not relevant as all other instances are guaranteed to be closed when the code is executed) Added DBUG_ASSERT() to prove above. storage/maria/ma_key_recover.c: CLR's for UNDO_ROW_DELETE and UNDO_ROW_UPDATE now include rowid for the row. This was needed for undo_key_delete to work, as undo of delete row is likely to put row in a new position. undo_delete_key now doesn't include row position storage/maria/ma_open.c: Added virtual functions for remembering and restoring scan position Fixed wrong key search method when using multi-byte character sets (Bug#32705) Store original column number in index file NOTE: Index files are now incompatible with previous versions! 
(Ok as we haven't yet made a public Maria release) storage/maria/ma_recovery.c: Set info->cur_row.lastpos when reading CLR's for UNDO_ROW_DELETE or UNDO_ROW_UPDATE storage/maria/ma_scan.c: Added default function to remember and restore scan position storage/maria/maria_def.h: Added virtual functions & variables to remember and restore scan position Added MARIA_MAX_CONTROL_FILE_LOCK_RETRY storage/myisam/ha_myisam.cc: Fixed compiler errors as columdef->type is now an enum, not an integer Added functions to remember and restore scan position storage/myisam/ha_myisam.h: Added functions to remember and restore scan position storage/myisam/mi_check.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/mi_extra.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/mi_open.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/myisamdef.h: MY_DONT_WAIT -> MY_SHORT_WAIT
2007-12-17 00:17:37 +01:00
key_nr= key_nr_korr(logpos);
page= page_korr(logpos + KEY_NR_STORE_SIZE);
share->state.key_root[key_nr]= (page == IMPOSSIBLE_PAGE_NO ?
HA_OFFSET_ERROR :
page * share->block_size);
break;
}
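The case block above turns the page number read from an UNDO_KEY_*_WITH_ROOT record into the byte offset stored in state.key_root[], mapping the "no root" marker to an error offset. The following is a self-contained sketch of that conversion only. TOY_IMPOSSIBLE_PAGE_NO, TOY_OFFSET_ERROR and toy_key_root_from_page are hypothetical names with assumed values chosen for illustration; they are not the definitions of IMPOSSIBLE_PAGE_NO or HA_OFFSET_ERROR from the Maria headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinels, chosen only for this sketch. */
#define TOY_IMPOSSIBLE_PAGE_NO UINT64_C(0xFFFFFFFFFF)  /* "index has no root" */
#define TOY_OFFSET_ERROR       UINT64_MAX              /* invalid file offset */

/*
  Convert a logged key-root page number to a byte offset in the index
  file. A page number equal to the sentinel means the tree is empty,
  so the root offset becomes the error value instead of a product.
*/
static uint64_t toy_key_root_from_page(uint64_t page, uint32_t block_size)
{
  assert(block_size > 0);
  return page == TOY_IMPOSSIBLE_PAGE_NO
           ? TOY_OFFSET_ERROR
           : page * (uint64_t) block_size;
}
```

Checking for the sentinel before multiplying matters: multiplying the sentinel by the block size would yield a large but plausible-looking offset rather than a recognizable "no root" value.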
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle) - fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc) - when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix) - UNDO_BULK_INSERT_WITH_REPAIR is also used with repair, changes name mysql-test/r/maria.result: tests for bugs fixed mysql-test/t/maria.test: tests for bugs fixed sql/sql_insert.cc: Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging. storage/maria/ha_maria.cc: Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()). storage/maria/ha_maria.h: variable now can have 3 states not 2 storage/maria/ma_bitmap.c: name change storage/maria/ma_blockrec.c: name change storage/maria/ma_blockrec.h: name change storage/maria/ma_check.c: * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before. * the log record of bulk insert can be used even without repair * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush). storage/maria/ma_key_recover.c: name change storage/maria/ma_loghandler.c: name change storage/maria/ma_loghandler.h: name change storage/maria/ma_pagecache.c: A function, to check in debug builds that no dirty pages exist for a file. 
storage/maria/ma_pagecache.h: new function (nothing in non-debug) storage/maria/ma_recovery.c: _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist). storage/maria/maria_chk.c: comment storage/maria/maria_def.h: new names mysql-test/suite/rpl/r/rpl_stm_maria.result: result (it used to crash) mysql-test/suite/rpl/t/rpl_stm_maria.test: Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
case LOGREC_UNDO_BULK_INSERT:
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
break;
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
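The WL#3072 message above explains why `undo_lsn==0` can no longer mean "this is not UNDO work": the LSN of the previous UNDO is legitimately 0 when the record being executed is the transaction's first UNDO. A minimal sketch of that sentinel change follows. The type `sketch_lsn`, the constant `SKETCH_LSN_ERROR` and its value, and both helper functions are hypothetical stand-ins, not the definitions used in the Maria sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the engine's LSN type. */
typedef uint64_t sketch_lsn;

/* Assumed value: any LSN that no real log record can have would do. */
#define SKETCH_LSN_ERROR ((sketch_lsn) 1)

/* Old convention: 0 meant "not UNDO work". It misclassifies the first
   UNDO of a transaction, whose previous-UNDO LSN is 0. */
static bool sketch_old_is_undo_work(sketch_lsn undo_lsn)
{
  return undo_lsn != 0;
}

/* New convention: only the impossible LSN means "not UNDO work". */
static bool sketch_is_undo_work(sketch_lsn undo_lsn)
{
  return undo_lsn != SKETCH_LSN_ERROR;
}
```

With the new test, a previous-UNDO LSN of 0 is still treated as UNDO work, which the old test got wrong.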
default:
DBUG_ASSERT(0);
}
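The switch that ends here dispatches on the type of record that a CLR_END undid. As the WL#3072 message describes, seeing a CLR_END in the REDO phase means setting `trn->undo_lsn` to the previous UNDO's LSN and sometimes adjusting `state.records`. The sketch below illustrates that bookkeeping under stated assumptions: all `sketch_` names are hypothetical, and the `records++` for an undone delete is inferred by symmetry with the `records--` that the message mentions for an undone insert.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;                 /* hypothetical LSN type */

enum sketch_undone_type
{
  SKETCH_UNDONE_INSERT,
  SKETCH_UNDONE_UPDATE,
  SKETCH_UNDONE_DELETE
};

struct sketch_trn   { sketch_lsn undo_lsn; };  /* stand-in for TRN */
struct sketch_state { uint64_t records; };     /* stand-in for state.records */

/* Replay the side effects of one CLR_END. Returns 0 on success and -1
   for an unknown record type (where the real code asserts). */
static int sketch_apply_clr_end(struct sketch_trn *trn,
                                struct sketch_state *state,
                                sketch_lsn previous_undo_lsn,
                                enum sketch_undone_type undone)
{
  switch (undone)
  {
  case SKETCH_UNDONE_INSERT:
    state->records--;          /* the inserted row was removed again */
    break;
  case SKETCH_UNDONE_DELETE:
    state->records++;          /* assumed: the deleted row came back */
    break;
  case SKETCH_UNDONE_UPDATE:
    break;                     /* row count unchanged */
  default:
    return -1;
  }
  trn->undo_lsn= previous_undo_lsn;   /* may legitimately be 0 */
  return 0;
}
```

Storing the previous UNDO's LSN lets rollback resume from the next record still to be undone if recovery is interrupted and restarted.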
Fixes for redo/undo logging of key pages
New extendable format for maria_log_control file
Fixed some compiler warnings

include/maria.h:
  Added maria_disable_logging() and maria_enable_logging()
mysql-test/include/maria_verify_recovery.inc:
  Updated tests now when key redo/undo works
mysql-test/r/maria-recovery.result:
  Updated tests now when key redo/undo works
storage/maria/ma_blockrec.c:
  Use unified CLR code
  Added rec_lsn for full pages
  Moved clr write hook to ma_key_recover.c
  Changed REDO code to keep pages pinned until undo
  Mark page_link's as changed
storage/maria/ma_blockrec.h:
  Moved write_hook_for_clr_end() to ma_key_recover.c
storage/maria/ma_check.c:
  Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE
  Fixed wrong warning when checking files after maria_pack
  When unpacking files, we have to use new keypos_to_recpos method
  When doing repair, we can disregard index key file pages in page cache
storage/maria/ma_commit.c:
  Added simple enable/disable logging functions (Needed for recovery)
storage/maria/ma_control_file.c:
  Make maria control file extendable without having to make it incompatible for older versions
storage/maria/ma_control_file.h:
  New error messages
  Added CONTROL_FILE_VERSION
storage/maria/ma_delete.c:
  Added redo/undo for key pages
  change_length -> changed_length to make things similar
  More comments & more DBUG
storage/maria/ma_key_recover.c:
  Unified CLR method
  Moved here write_hook_for_clr_end() and common keypage log functions
  Changed REDO to keep pages pinned until undo
  Changed UNDO code to change key_root under log mutex
storage/maria/ma_key_recover.h:
  New structures and functions
storage/maria/ma_loghandler.c:
  Include needed files
storage/maria/ma_open.c:
  Change maria_open() to use pread() instead of read()
storage/maria/ma_page.c:
  Fixed bug in key_del handling
  Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined
storage/maria/ma_pagecache.c:
  Indentation and spelling fixes
  More DBUG
  Added helper function: pagecache_block_link_to_buffer()
storage/maria/ma_pagecache.h:
  Added pagecache_block_link_to_buffer()
storage/maria/ma_recovery.c:
  Fixed state.changed
  Fixed that REDO keeps pages pinned until UNDO
  Some bug fixes from previous commit
  Fixes for UNDO/REDO of key pages
storage/maria/ma_search.c:
  Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes.
storage/maria/ma_test1.c:
  Fixed bug with not initialized variable
storage/maria/ma_test2.c:
  Removed not used code
storage/maria/ma_test_all.res:
  Updated results
storage/maria/ma_test_all.sh:
  Changed one test to test more
  Removed timing tests as not relevant here
storage/maria/ma_test_recovery.expected:
  Updated test result after redo/undo of key pages works
storage/maria/ma_test_recovery:
  Updated test after redo/undo of key pages works
storage/maria/ma_write.c:
  Moved some general log functions to ma_key_recover.c
  Fixed some bugs in undo
  Moved ma_log_split() to _ma_split_page()
  Small changes in some function arguments to be able to do redo logging
storage/maria/maria_chk.c:
  disable logging while doing repair table
storage/maria/maria_def.h:
  New function prototypes
  Move some structs and functions to ma_key_recover.c
storage/maria/unittest/ma_control_file-t.c:
  Updated with patch from Sanja
  NOTE: This is not complete and needs to be updated to new control file format
storage/maria/unittest/ma_test_loghandler-t.c:
  Fixed compiler warning
2007-11-20 16:42:16 +01:00
if (row_entry && share->calc_checksum)
Fixed bug in undo_key_delete; Caused crashed key files in recovery Maria is now used for internal temporary tables in MySQL Better usage of VARCHAR and long strings in temporary tables Use packed fields if BLOCK_RECORD is used null_bytes are not anymore stored in a separate field New interface to remember and restore scan position Fixed bugs in unique handling Don't sync Maria temporary tables Lock control file while it's used to stop several processes from using it Changed value of MA_DONT_OVERWRITE_FILE as it collided with MY_SYNC_DIR Split MY_DONT_WAIT into MY_NO_WAIT and MY_SHORT_WAIT (for my_lock()) Added MY_FORCE_LOCK include/my_sys.h: Changed value of MA_DONT_OVERWRITE_FILE as it collided with MY_SYNC_DIR Split MY_DONT_WAIT into MY_NO_WAIT and MY_SHORT_WAIT (for my_lock()) Added MY_FORCE_LOCK include/myisam.h: Make MyISAM columndef compile time compatible with Maria mysql-test/lib/mtr_process.pl: Removed confusing warning (It's common that there is a lot of other files than pid files) mysql-test/mysql-test-run.pl: Added --sync-frm to speed up tests mysql-test/r/maria-recovery.result: Updated results from wrong push mysql-test/suite/rpl/t/rpl_innodb_bug28430.test: Marked test as --big mysys/my_lock.c: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) mysys/my_thr_init.c: Fix that we don't give name to thread before it's properly initied sql/handler.cc: Added myisam.h sql/handler.h: Changes to use Maria for internal temporary tables Removed not needed argument to restart_rnd_next() Added function remember_rnd_pos() sql/my_lock.c: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) 
sql/mysql_priv.h: Added maria_hton sql/sql_class.h: Changes to use Maria for internal temporary tables sql/sql_select.cc: Changes to use Maria for internal temporary tables Temporary tables didn't properly switch to dynamic row format if long strings was used Better usage of VARCHAR in temporary tables Use new interface to restart scan in duplicate removal sql/sql_select.h: Changes to use Maria for internal temporary tables sql/sql_show.cc: Changes to use Maria for internal temporary tables Removed all end space sql/sql_table.cc: Set HA_OPTION_PACK_RECORD if we are not using default or static record sql/sql_union.cc: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) sql/sql_update.cc: If MY_FORCE_LOCK is given, use locking even if my_disable_locking is given If MY_NO_WAIT is given, return at once if lock is occupied If MY_SHORT_WAIT is given, wait some time for lock before returning (This was called MY_DONT_WAIT before) storage/maria/ha_maria.cc: Use packed fields null_bytes are not anymore stored in a separate field Changes to use Maria for internal temporary tables Give warning if we try to do an ALTER TABLE to a unusable row format storage/maria/ha_maria.h: Allow Maria with block format to restart scanning at given position storage/maria/ma_blockrec.c: Added functions to remember and restore scan position Allocate cur_row.extents so that we don't have to do a malloc on first read Fixed bug when using packed row without packed strings Removed unneeded calls to free_full_pages() Fixed unlikely bug when using old bitmap to read head page and head page had gone away Remember row position when doing undo of delete and update row (needed for undo of key delete) storage/maria/ma_blockrec.h: Added functions to remember and restore scan position storage/maria/ma_close.c: Don't sync 
temporary tables storage/maria/ma_control_file.c: Lock control file while it's used to stop several processes from using it storage/maria/ma_create.c: Fixed bug when using FIELD_NORMAL that was longer than FULL_PAGE_SIZE Fixed bug that caused fields to not be ordered according to offset Fixed bug in unique creation storage/maria/ma_delete.c: Don't write record reference when deleting key. (Rowid is likely to be different when we undo this) storage/maria/ma_dynrec.c: Fixed core dump when comparing records (happened in unique handling) storage/maria/ma_extra.c: MY_DONT_WAIT -> MY_SHORT_WAIT Removed TODO comment. (Was not relevant as all other instances are guaranteed to be closed when the code is executed) Added DBUG_ASSERT() to prove above. storage/maria/ma_key_recover.c: CLR's for UNDO_ROW_DELETE and UNDO_ROW_UPDATE now include rowid for the row. This was needed for undo_key_delete to work, as undo of delete row is likely to put row in a new position. undo_delete_key now doesn't include row position storage/maria/ma_open.c: Added virtual functions for remembering and restoring scan position Fixed wrong key search method when using multi-byte character sets (Bug#32705) Store original column number in index file NOTE: Index files are now incompatible with previous versions!
(Ok as we haven't yet made a public Maria release) storage/maria/ma_recovery.c: Set info->cur_row.lastpos when reading CLR's for UNDO_ROW_DELETE or UNDO_ROW_UPDATE storage/maria/ma_scan.c: Added default function to remember and restore scan position storage/maria/maria_def.h: Added virtual functions & variables to remember and restore scan position Added MARIA_MAX_CONTROL_FILE_LOCK_RETRY storage/myisam/ha_myisam.cc: Fixed compiler errors as columdef->type is now an enum, not an integer Added functions to remember and restore scan position storage/myisam/ha_myisam.h: Added functions to remember and restore scan position storage/myisam/mi_check.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/mi_extra.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/mi_open.c: MY_DONT_WAIT -> MY_SHORT_WAIT storage/myisam/myisamdef.h: MY_DONT_WAIT -> MY_SHORT_WAIT
2007-12-17 00:17:37 +01:00
share->state.state.checksum+= ha_checksum_korr(logpos);
Added maria_zerofill()
This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type
Added --zerofill (-z) option to maria_chk (Mostly code from Jani)
Added new table states ZEROFILLED and MOVABLE
Set state.changed with new states when things change

include/maria.h:
  Added maria_zerofill
include/myisamchk.h:
  Added option for zerofill
  Extend testflag to be 64 to allow for more flags
mysql-test/r/create.result:
  Updated results after merge
mysql-test/r/maria.result:
  Updated results after merge
mysys/my_getopt.c:
  Removed not used variable
sql/sql_show.cc:
  Fixed wrong page type
storage/maria/ma_blockrec.c:
  Renamed compact_page() to ma_compact_block_page() and made it global
  Always zerofill half filled blob pages
  Set share.state.changed on REDO
storage/maria/ma_blockrec.h:
  Added _ma_compact_block_page()
storage/maria/ma_check.c:
  Added maria_zerofill()
  This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type
  This gives the following benefits:
  - Table is smaller if compressed
  - All LSN are removed for transactional tables and this makes them movable between systems
  Don't set table states if we are using --quick
  Changed log entry for repair to use 8 bytes for flag
storage/maria/ma_delete.c:
  Simplify code
  Update state.changed
storage/maria/ma_key_recover.c:
  Update state.changed
storage/maria/ma_locking.c:
  Set uuid for file on first change if it's not set (table was cleared with zerofill)
storage/maria/ma_loghandler.c:
  Updated log entry for REDO_REPAIR_TABLE
storage/maria/ma_recovery.c:
  Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes)
  Set new bits in state.changed
storage/maria/ma_test_all.sh:
  Nicer output
storage/maria/ma_test_recovery.expected:
  Updated results (now states flags are visible)
storage/maria/ma_update.c:
  Update state.changed
storage/maria/ma_write.c:
  Simplify code
  Set state.changed
storage/maria/maria_chk.c:
  Added option --zerofill
  Added printing of states for MOVABLE and ZEROFILLED
  MYD -> MAD
  MYI -> MAI
storage/maria/maria_def.h:
  Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED
  Added prototype for new functions
storage/maria/unittest/ma_test_all-t:
  More tests, including tests for zerofill
  Removed some not needed 'print' statements
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
STATE_NOT_ZEROFILLED | STATE_NOT_MOVABLE);
2007-09-06 16:04:36 +02:00
}
2007-11-20 16:42:16 +01:00
if (row_entry)
tprint(tracef, " rows' count %lu\n", (ulong)share->state.state.records);
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group).
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
_ma_unpin_all_pages(info, rec->lsn);
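The call above is the "unpin and stamp" step described in the WL#4137 message: while the REDOs of one group are applied, pages stay pinned with their LSN untouched, and only at the end of the group are they stamped with the UNDO's LSN. The sketch below models why that ordering matters. Everything named `sketch_` is hypothetical, and the skip rule (page LSN greater than or equal to the record's LSN) is an assumption about how "already applied" is detected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;           /* hypothetical LSN type */

struct sketch_page
{
  sketch_lsn lsn;                      /* LSN stamped on the page */
  int value;                           /* stand-in for page content */
  bool pinned;                         /* touched by the current group */
};

/* Apply one REDO unless the page already reflects it (assumed rule).
   The page LSN is deliberately left alone so that a later REDO of the
   same group, whose LSN is below the UNDO's LSN, is not skipped. */
static bool sketch_apply_redo(struct sketch_page *page,
                              sketch_lsn redo_lsn, int delta)
{
  if (page->lsn >= redo_lsn)
    return false;                      /* change is already on the page */
  page->value+= delta;
  page->pinned= true;
  return true;
}

/* End of group: stamp every pinned page with the UNDO's LSN, unpin. */
static void sketch_unpin_all_pages(struct sketch_page *pages, size_t count,
                                   sketch_lsn undo_lsn)
{
  for (size_t i= 0; i < count; i++)
  {
    if (pages[i].pinned)
    {
      pages[i].lsn= undo_lsn;
      pages[i].pinned= false;
    }
  }
}
```

Had the first REDO stamped the page with the UNDO's LSN immediately, the second REDO of the group would have compared against that larger LSN and been skipped, which is the bug the commit message describes.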
Fixed repair_by_sort to work with BLOCK_RECORD
Fixed bugs in undo logging
Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read)
Reserved place for reference-transid on key pages (for packing of transids)
ALTER TABLE and INSERT ... SELECT now uses fast creation of index

Known bugs:
  ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix)
  ma_test_recovery.expected is not totally correct, because of the above bug
  mysqld sometimes fails to restart; Fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate

include/maria.h:
  Prototype changes
  Added current_filepos to st_maria_sort_info
mysql-test/r/maria.result:
  Updated results that change as alter table and insert ... select now uses fast creation of index
mysys/mf_iocache.c:
  Reset variable to guard against double invocation
storage/maria/ma_bitmap.c:
  Added _ma_bitmap_reset_cache() (needed for repair)
storage/maria/ma_blockrec.c:
  Simplify code
  More initial allocations
  Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read)
storage/maria/ma_blockrec.h:
  Moved TRANSID_SIZE to maria_def.h
  Added prototype for new functions
storage/maria/ma_check.c:
  Simplify code
  Fixed repair_by_sort to work with BLOCK_RECORD
  - When using BLOCK_RECORD or UNPACK create new Maria handle
  - Use common initializer function
  - Align code with maria_repair()
  Made some changes to maria_repair_parallel() to use common initializer function
  Removed ASK_MONTY section by fixing noted problem
storage/maria/ma_close.c:
  Moved check for readonly to _ma_state_info_write()
storage/maria/ma_key_recover.c:
  Use different log entries if key root changes or not. This fixed some bugs when tree grows
storage/maria/ma_key_recover.h:
  Added keynr to st_msg_to_write_hook_for_undo_key
storage/maria/ma_loghandler.c:
  Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT
storage/maria/ma_loghandler.h:
  Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT
storage/maria/ma_open.c:
  Added TRANSID to all key pages (for future compressing of trans id's)
  For compressed records, alloc a bit bigger buffer to avoid valgrind warnings
  If table is opened readonly, don't update state
storage/maria/ma_packrec.c:
  Allocate bigger array for bit unpacking to avoid valgrind errors
storage/maria/ma_recovery.c:
  Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT
storage/maria/ma_sort.c:
  More logging
storage/maria/ma_test_all.sh:
  More tests
storage/maria/ma_test_recovery.expected:
  Update results
  Note that this is not complete because of a bug in recovery
storage/maria/ma_test_recovery:
  Removed recreation of index (not needed when we have redo for index pages)
storage/maria/maria_chk.c:
  When using flag --read-only, don't update status for files
  When using --unpack, don't use REPAIR_BY_SORT if other repair option is given
  Enable repair_by_sort for BLOCK records
  Removed not needed newline at start of --describe
storage/maria/maria_def.h:
  Support for TRANSID_SIZE to key pages
storage/maria/maria_read_log.c:
  renamed --only-display to --display-only
2007-11-28 20:38:30 +01:00
DBUG_RETURN(0);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
}
/**
Hook to print debug information (like MySQL query)
*/
prototype_redo_exec_hook(DEBUG_INFO)
{
uchar *data;
enum translog_debug_info_type debug_info;
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
{
eprint(tracef, "Failed to read debug info record");
return 1;
}
debug_info= (enum translog_debug_info_type) log_record_buffer.str[0];
data= log_record_buffer.str + 1;
switch (debug_info) {
case LOGREC_DEBUG_INFO_QUERY:
tprint(tracef, "Query: %.*s\n", rec->record_length - 1,
(char*) data);
break;
default:
DBUG_ASSERT(0);
}
return 0;
}
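The hook above reads the whole record into a buffer, takes the first byte as the record's type, and treats the remaining `record_length - 1` bytes as the payload. A minimal standalone sketch of that layout follows. Every `sketch_` name is hypothetical and is not part of the Maria API, and the numeric value chosen for the query type is an assumption. Unlike the hook, which asserts on an unknown type, this sketch returns an error instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for enum translog_debug_info_type (value assumed). */
enum sketch_debug_info_type { SKETCH_DEBUG_INFO_QUERY = 0 };

/*
  Format a debug-info record into 'out'.
  Layout assumed from the hook above: buf[0] is the type byte and
  buf[1 .. len-1] is the payload, which is not NUL-terminated.
  Returns 0 on success, 1 on an empty record, an unknown type,
  or an output buffer that is too small.
*/
static int sketch_format_debug_info(const unsigned char *buf, size_t len,
                                    char *out, size_t out_size)
{
  if (buf == NULL || len < 1 || out_size == 0)
    return 1;
  switch ((enum sketch_debug_info_type) buf[0]) {
  case SKETCH_DEBUG_INFO_QUERY:
  {
    /* The precision limits the read to the payload length. */
    int n= snprintf(out, out_size, "Query: %.*s",
                    (int) (len - 1), (const char *) (buf + 1));
    return (n < 0 || (size_t) n >= out_size) ? 1 : 0;
  }
  default:
    return 1;                         /* unknown debug-info type */
  }
}
```

Passing the payload length as the `%.*s` precision mirrors the `tprint()` call in the hook and avoids reading past the end of a payload that has no terminating zero.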
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
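The commit note above says that recovery skips a record when its table is missing or corrupted, and skips REDOs and UNDOs in the REDO phase when they are older than the table's skip_redo_lsn. A minimal sketch of that decision follows. All `sketch_` names are hypothetical, LSNs are modelled as plain 64-bit integers that compare in log order (an assumption), and the strict `<` boundary is taken from the wording of the commit note rather than from the code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;          /* hypothetical stand-in for LSN */

/* Hypothetical, reduced view of the per-table state recovery looks at. */
struct sketch_table_state
{
  sketch_lsn skip_redo_lsn;           /* records older than this are stale */
  int crashed;                        /* non-zero if marked crashed */
};

enum sketch_redo_action { SKETCH_APPLY, SKETCH_SKIP };

/*
  Decide whether a record met in the REDO phase should be applied.
  A missing table (NULL) or a crashed table is skipped rather than
  treated as a fatal error, so recovery can move on to the next record.
*/
static enum sketch_redo_action
sketch_redo_decision(const struct sketch_table_state *table,
                     sketch_lsn rec_lsn)
{
  if (table == NULL || table->crashed)
    return SKETCH_SKIP;
  if (rec_lsn < table->skip_redo_lsn)
    return SKETCH_SKIP;               /* predates a create/repair */
  return SKETCH_APPLY;
}
```

Keeping the decision in one small function makes the "skip, don't fail" policy explicit; the real code also has to emit warnings when the skip happens in the UNDO phase, which this sketch leaves out.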
/**
In some cases we have to skip execution of an UNDO record during the UNDO
phase.
*/
static void skip_undo_record(LSN previous_undo_lsn, TRN *trn)
{
trn->undo_lsn= previous_undo_lsn;
if (previous_undo_lsn == LSN_IMPOSSIBLE) /* has fully rolled back */
trn->first_undo_lsn= LSN_WITH_FLAGS_TO_FLAGS(trn->first_undo_lsn);
skipped_undo_phase++;
}
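skip_undo_record() moves the transaction back along its chain of UNDO records without executing the current one: each UNDO starts with the LSN of the previous UNDO, and reaching LSN_IMPOSSIBLE means the transaction has fully rolled back. A minimal sketch of walking such a chain follows. All `sketch_` names are hypothetical, the log is modelled as a small in-memory array, and the handling of flags in first_undo_lsn is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;                 /* hypothetical stand-in for LSN */
#define SKETCH_LSN_IMPOSSIBLE ((sketch_lsn) 0)

/* Hypothetical UNDO record: its own LSN plus a link to the previous UNDO. */
struct sketch_undo
{
  sketch_lsn lsn;
  sketch_lsn previous_undo_lsn;
  int table_usable;                          /* 0 if table missing or crashed */
};

struct sketch_trn
{
  sketch_lsn undo_lsn;                       /* next UNDO to process */
  unsigned executed;
  unsigned skipped;
};

static const struct sketch_undo *
sketch_find(const struct sketch_undo *log, size_t n, sketch_lsn lsn)
{
  size_t i;
  for (i= 0; i < n; i++)
    if (log[i].lsn == lsn)
      return &log[i];
  return NULL;
}

/*
  Walk the UNDO chain from trn->undo_lsn back to its start. Records on
  unusable tables are counted as skipped but the chain is still followed.
  Returns 0 when fully rolled back, 1 on a broken chain.
*/
static int sketch_rollback(struct sketch_trn *trn,
                           const struct sketch_undo *log, size_t n)
{
  while (trn->undo_lsn != SKETCH_LSN_IMPOSSIBLE)
  {
    const struct sketch_undo *rec= sketch_find(log, n, trn->undo_lsn);
    /* The chain must exist and must point strictly backwards in the log. */
    if (rec == NULL || rec->previous_undo_lsn >= rec->lsn)
      return 1;
    if (rec->table_usable)
      trn->executed++;                       /* real code would apply the UNDO */
    else
      trn->skipped++;                        /* like skip_undo_record() */
    trn->undo_lsn= rec->previous_undo_lsn;
  }
  return 0;
}
```

With a three-record chain whose middle record belongs to an unusable table, the walk ends with two executed records, one skipped record, and `undo_lsn` equal to `SKETCH_LSN_IMPOSSIBLE`.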
prototype_undo_exec_hook(UNDO_ROW_INSERT)
{
my_bool error;
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
LSN previous_undo_lsn= lsn_korr(rec->header);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
const uchar *record_ptr;
if (info == NULL || maria_is_crashed(info))
{
/*
Unlike for REDOs, a skipped table is abnormal here: we have a
transaction to roll back which used this table; since it is not yet rolled
back, it was supposed to hold this table and so the table should still be
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
there. Skip it (user may have repaired the table with maria_chk because
it was so badly corrupted that a previous recovery failed) but warn.
*/
skip_undo_record(previous_undo_lsn, trn);
return 0;
}
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we now need it also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
share= info->s;
Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type Added --zerofill (-z) option to maria_chk (Mostly code from Jani) Added new table states ZEROFILLED and MOVABLE Set state.changed with new states when things change include/maria.h: Added maria_zerofill include/myisamchk.h: Added option for zerofill Extend testflag to be 64 to allow for more flags mysql-test/r/create.result: Updated results after merge mysql-test/r/maria.result: Updated results after merge mysys/my_getopt.c: Removed not used variable sql/sql_show.cc: Fixed wrong page type storage/maria/ma_blockrec.c: Renamed compact_page() to ma_compact_block_page() and made it global Always zerofill half filled blob pages Set share.state.changed on REDO storage/maria/ma_blockrec.h: Added _ma_compact_block_page() storage/maria/ma_check.c: Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type This gives the following benefits: - Table is smaller if compressed - All LSN are removed for transactional tables and this makes them movable between systems Don't set table states if we are using --quick Changed log entry for repair to use 8 bytes for flag storage/maria/ma_delete.c: Simplify code Update state.changed storage/maria/ma_key_recover.c: Update state.changed storage/maria/ma_locking.c: Set uuid for file on first change if it's not set (table was cleared with zerofill) storage/maria/ma_loghandler.c: Updated log entry for REDO_REPAIR_TABLE storage/maria/ma_recovery.c: Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes) Set new bits in state.changed storage/maria/ma_test_all.sh: Nicer output storage/maria/ma_test_recovery.expected: Updated results (now states flags are visible) storage/maria/ma_update.c: Update state.changed storage/maria/ma_write.c: Simplify code Set state.changed 
storage/maria/maria_chk.c: Added option --zerofill Added printing of states for MOVABLE and ZEROFILLED MYD -> MAD MYI -> MAI storage/maria/maria_def.h: Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED Added prototype for new functions storage/maria/unittest/ma_test_all-t: More tests, including tests for zerofill Removed some not needed 'print' statements storage/maria/unittest/ma_test_loghandler_multithread-t.c: Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
STATE_NOT_OPTIMIZED_ROWS | STATE_NOT_ZEROFILLED |
STATE_NOT_MOVABLE);
record_ptr= rec->header;
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expects the new row's checksum in cur_row.checksum (was already true) and the old row's checksum in new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
if (share->calc_checksum)
{
/*
We need to read more of the record to put the checksum into the record
buffer used by _ma_apply_undo_row_insert().
If the table has no live checksum, rec->header will be enough.
*/
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to read record");
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expects the new row's checksum in cur_row.checksum (was already true) and the old row's checksum in new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of the record's checksum. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c.
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and so is unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and so is unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds).
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum appears to work (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, the live checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if().
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
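The checksum-delta scheme described in the commit message above can be illustrated with a minimal sketch. Everything in it is a simplified assumption made for illustration: the types `lsn_t` and `table_state` and the helpers `delta_store`, `delta_korr`, `checksum_delta` and `redo_apply_checksum_delta` are hypothetical names, not the actual Maria structures or macros. It shows only the two ideas stated in the message: the delta fits in 4 bytes, and the REDO phase applies it only when the table state is older than the log record.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real Maria definitions. */
typedef uint64_t lsn_t;

typedef struct
{
  uint32_t checksum;   /* the table's live checksum */
  lsn_t    is_of_lsn;  /* the state reflects the log up to this LSN */
} table_state;

/* Store a 32-bit delta in 4 bytes, low byte first (illustrative layout). */
static void delta_store(unsigned char *to, uint32_t delta)
{
  assert(to != 0);
  to[0]= (unsigned char) (delta & 0xFF);
  to[1]= (unsigned char) ((delta >> 8) & 0xFF);
  to[2]= (unsigned char) ((delta >> 16) & 0xFF);
  to[3]= (unsigned char) ((delta >> 24) & 0xFF);
}

/* Read back a delta written by delta_store(). */
static uint32_t delta_korr(const unsigned char *from)
{
  assert(from != 0);
  return (uint32_t) from[0] | ((uint32_t) from[1] << 8) |
         ((uint32_t) from[2] << 16) | ((uint32_t) from[3] << 24);
}

/* Variation caused by one operation; unsigned arithmetic wraps modulo 2^32,
   so a "negative" variation is still representable in 4 bytes. */
static uint32_t checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/* REDO phase: apply the logged delta only if the state is older than the
   record. Returns 1 if the delta was applied, 0 if the record was skipped. */
static int redo_apply_checksum_delta(table_state *state, lsn_t rec_lsn,
                                     const unsigned char *stored_delta)
{
  assert(state != 0);
  if (state->is_of_lsn >= rec_lsn)
    return 0;                       /* state already includes this record */
  state->checksum+= delta_korr(stored_delta);
  return 1;
}
```

With the numbers from the commit message, a change from 123 to 456 is stored as 333, and replaying that record on a state whose checksum is 123 brings it back to 456. A state that is already at or past the record's LSN is left untouched, which is what keeps the replay idempotent in this sketch.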
2007-10-02 18:02:09 +02:00
return 1;
}
record_ptr= log_record_buffer.str;
}
info->trn= trn;
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
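The CLR_END handling described in the commit message above can be sketched as follows. This is an illustration under stated assumptions, not the code of the patch: the types `clr_end_sketch`, `trn_sketch` and `state_sketch`, the enum values, the sentinel `SKETCH_NOT_UNDO_WORK` and both functions are hypothetical. The message only states that an undone insert decrements the row count; treating an undone delete as an increment and an undone update as neutral is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real Maria definitions. */
typedef uint64_t lsn_t;

/* Assumed sentinel meaning "this is not UNDO work". It must differ from 0,
   because 0 is a legitimate "previous UNDO" value for a transaction's
   first UNDO. */
#define SKETCH_NOT_UNDO_WORK ((lsn_t) 1)

typedef enum { UNDONE_ROW_INSERT, UNDONE_ROW_DELETE, UNDONE_ROW_UPDATE } undone_type;

typedef struct
{
  lsn_t       previous_undo_lsn;  /* LSN of the UNDO before the undone one */
  undone_type type;               /* which kind of record was undone */
} clr_end_sketch;

typedef struct { lsn_t undo_lsn; } trn_sketch;
typedef struct { uint64_t records; } state_sketch;

/* Effect of a CLR_END, whether written in the UNDO phase or seen again in
   the REDO phase: the transaction's undo chain moves back by one record and
   the row count is adjusted according to what was undone. */
static void apply_clr_end(trn_sketch *trn, state_sketch *state,
                          const clr_end_sketch *clr)
{
  assert(trn && state && clr);
  trn->undo_lsn= clr->previous_undo_lsn;
  switch (clr->type)
  {
  case UNDONE_ROW_INSERT: state->records--; break; /* insert rolled back */
  case UNDONE_ROW_DELETE: state->records++; break; /* assumed: row is back */
  case UNDONE_ROW_UPDATE: break;                   /* assumed: count unchanged */
  }
}

/* 0 can no longer mean "not UNDO work", so compare against the sentinel. */
static int is_undo_work(lsn_t undo_lsn)
{
  return undo_lsn != SKETCH_NOT_UNDO_WORK;
}
```

Storing the previous UNDO's LSN means that, after the CLR_END is processed, the transaction points directly at the next record still to be undone, and a value of 0 simply signals that nothing is left.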
2007-09-06 16:04:36 +02:00
error= _ma_apply_undo_row_insert(info, previous_undo_lsn,
2007-10-02 18:02:09 +02:00
record_ptr + LSN_STORE_SIZE +
FILEID_STORE_SIZE);
info->trn= 0;
2007-09-06 16:04:36 +02:00
/* trn->undo_lsn is updated in an inwrite_hook when writing the CLR_END */
tprint(tracef, " rows' count %lu\n", (ulong)info->s->state.state.records);
tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional.
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
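The skipping rules listed for ma_recovery.c in the commit message above can be summarized in a small decision function. This is only a sketch under assumptions: `table_sketch`, `recovery_phase`, `recovery_action` and `decide_record()` are hypothetical names, and the real code spreads these checks over several functions. It encodes three statements from the message: an unusable table (missing, corrupted, or repaired with maria_chk) is skipped rather than failing recovery, a skipped UNDO in the UNDO phase produces a warning, and in the REDO phase records older than skip_redo_lsn are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real Maria definitions. */
typedef uint64_t lsn_t;

typedef enum { PHASE_REDO, PHASE_UNDO } recovery_phase;
typedef enum { APPLY_RECORD, SKIP_SILENTLY, SKIP_WITH_WARNING } recovery_action;

typedef struct
{
  int   usable;         /* 0 if corrupted or repaired with maria_chk */
  lsn_t skip_redo_lsn;  /* records below this LSN are obsolete for the table */
} table_sketch;

/* Decide what to do with one log record. A NULL table models a table
   that could not be found or opened. */
static recovery_action decide_record(const table_sketch *table,
                                     recovery_phase phase, lsn_t rec_lsn)
{
  assert(phase == PHASE_REDO || phase == PHASE_UNDO);
  if (table == NULL || !table->usable)
    return phase == PHASE_UNDO ? SKIP_WITH_WARNING : SKIP_SILENTLY;
  if (phase == PHASE_REDO && rec_lsn < table->skip_redo_lsn)
    return SKIP_SILENTLY;   /* e.g. a REDO that predates a repair */
  return APPLY_RECORD;
}
```

The point of returning an action instead of an error code is that recovery moves on to the next record in every case; only the UNDO-phase skip is surfaced to the user, since it means a rollback could not be carried out on that table.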
2008-01-17 23:59:32 +01:00
LSN_IN_PARTS(trn->undo_lsn));
return error;
}
prototype_undo_exec_hook(UNDO_ROW_DELETE)
{
my_bool error;
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
WL#3072 Maria Recovery
misc fixes of execution of UNDOs in the UNDO phase:
- into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase).
- declaring all UNDOs and CLR_END as "compressed"
- when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints).
- bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away).
- modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing).
- ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug.

storage/maria/ma_blockrec.c:
  - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead.
  - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN).
  - store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase.
  - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE.
  - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state")
storage/maria/ma_loghandler.c:
  - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed".
  - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records.
  - reset share->id to 0 when deassigning; not useful for now but sounds logical.
storage/maria/ma_recovery.c:
  - if no table is found for a REDO, it's not an error; for an UNDO, it is
  - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records.
  - in the UNDO phase, when we execute an UNDO_ROW_INSERT:
    * update trn->undo_lsn only after executing the record
    * store the _previous_ undo_lsn into the CLR_END
  - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them.
storage/maria/ma_test1.c:
  * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point.
  * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery
storage/maria/ma_test_recovery:
  * Testing execution of UNDOs, with and without BLOBs.
  * Testing idempotency of REDOs.
  * See @todo for a probable bug with BLOBs.
  * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c).
  * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected).
  * Less output on the screen, compares with expected output in the end.
  * some shell thingies like "set --" and $# are courtesy of Danny and Pekka.
storage/maria/maria_read_log.c:
  when only displaying the records, don't do an UNDO phase
storage/maria/ma_test_recovery.expected:
  This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
LSN previous_undo_lsn= lsn_korr(rec->header);
First part of redo/undo for key pages
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows
Checksum for MyISAM now ignores NULL and not used part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable

include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
  Added new error message to be used when handler initialization failed
include/my_global.h:
  Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
  Added const to some parameters
mysys/array.c:
  More DBUG
mysys/my_error.c:
  Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h:
  Renamed variables to avoid variable shadowing
sql/handler.h:
  Renamed parameter to avoid variable name conflict
sql/item.h:
  Renamed variables to avoid variable shadowing
sql/log_event_old.h:
  Renamed variables to avoid variable shadowing
sql/set_var.h:
  Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc:
  Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
  This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am:
  Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
  Removed extra empty line
storage/maria/ma_checkpoint.c:
  Remove not needed trnman.h
storage/maria/ma_close.c:
  Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
  Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c:
  New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (we need now also for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some not needed arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some not needed initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
  Removed end space
storage/maria/ma_check.c:
  Removed end space
storage/maria/ma_create.c:
  Removed end space
storage/maria/ma_locking.c:
  Removed end space
storage/maria/ma_packrec.c:
  Removed end space
storage/maria/ma_pagecache.c:
  Removed end space
storage/maria/ma_panic.c:
  Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  use MARIA_RECORD_POS of record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c:
  Removed end space
storage/maria/ma_statrec.c:
  Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Remove not used code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
  Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open()
  Zerofill new used pages for:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c:
  Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
  Updated arguments for changed function calls
storage/myisam/mi_check.c:
  Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
  Update checksums to ignore null columns
storage/myisam/mi_create.c:
  Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
  Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
  Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
  New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
  New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
  New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
if (info == NULL || maria_is_crashed(info))
{
skip_undo_record(previous_undo_lsn, trn);
return 0;
}
share= info->s;
Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type Added --zerofill (-z) option to maria_chk (Mostly code from Jani) Added now table states ZEROFILLED and MOVEABLE Set state.changed with new states when things changes include/maria.h: Added maria_zerofill include/myisamchk.h: Added option for zerofill Extend testflag to be 64 to allow for more flags mysql-test/r/create.result: Updated results after merge mysql-test/r/maria.result: Updated results after merge mysys/my_getopt.c: Removed not used variable sql/sql_show.cc: Fixed wrong page type storage/maria/ma_blockrec.c: Renamed compact_page() to ma_compact_block_page() and made it global Always zerofill half filled blob pages Set share.state.changed on REDO storage/maria/ma_blockrec.h: Added _ma_compact_block_page() storage/maria/ma_check.c: Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type This gives the following benefits: - Table is smaller if compressed - All LSN are removed for transactinal tables and this makes them movable between systems Dont set table states of we are using --quick Changed log entry for repair to use 8 bytes for flag storage/maria/ma_delete.c: Simplify code Update state.changed storage/maria/ma_key_recover.c: Update state.changed storage/maria/ma_locking.c: Set uuid for file on first change if it's not set (table was cleared with zerofill) storage/maria/ma_loghandler.c: Updated log entry for REDO_REPAIR_TABLE storage/maria/ma_recovery.c: Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes) Set new bits in state.changed storage/maria/ma_test_all.sh: Nicer output storage/maria/ma_test_recovery.expected: Updated results (now states flags are visible) storage/maria/ma_update.c: Update state.changed storage/maria/ma_write.c: Simplify code Set state.changed 
storage/maria/maria_chk.c: Added option --zerofill Added printing of states for MOVABLE and ZEROFILLED MYD -> MAD MYI -> MAI storage/maria/maria_def.h: Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED Added prototype for new functions storage/maria/unittest/ma_test_all-t: More tests, including tests for zerofill Removed some not needed 'print' statements storage/maria/unittest/ma_test_loghandler_multithread-t.c: Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
STATE_NOT_ZEROFILLED | STATE_NOT_MOVABLE);
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables) storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automatically from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to read record");
return 1;
}
info->trn= trn;
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
error= _ma_apply_undo_row_delete(info, previous_undo_lsn,
log_record_buffer.str + LSN_STORE_SIZE +
UNDO of rows now puts back all part of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn of DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for not transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed of REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that was defined as BOOLEAN in dbug.c code Added DBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn of DBUG with --debug-on=0) Indentation fixes 
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for not transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all part of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directroy entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minmum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_isnert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in it's original position Ensure allocation of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed some bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there are no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused page cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easily mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings.
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we were in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always include row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before blobs were always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that test recovery of update with blobs Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_mara_row->min_length' for storing min length needed on head page Added 'st_mara_handler->blob_buff & blob_buff_size' for
storing blobs Moved all tuning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before blobs were always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
FILEID_STORE_SIZE,
rec->record_length -
(LSN_STORE_SIZE + FILEID_STORE_SIZE));
info->trn= 0;
tprint(tracef, " rows' count %lu\n undo_lsn now LSN (%lu,0x%lx)\n",
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
(ulong)share->state.state.records, LSN_IN_PARTS(trn->undo_lsn));
return error;
}
prototype_undo_exec_hook(UNDO_ROW_UPDATE)
{
my_bool error;
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
LSN previous_undo_lsn= lsn_korr(rec->header);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
MARIA_SHARE *share;
if (info == NULL || maria_is_crashed(info))
{
skip_undo_record(previous_undo_lsn, trn);
return 0;
}
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
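Note on the commit message above: it says that for transactional tables the record number stored in keys is shifted up by one bit, leaving room for a flag that tells whether a transid follows. A minimal sketch of that idea, assuming the flag lives in the lowest bit; the type and function names (sketch_recpos, sketch_pack_recno, ...) are hypothetical and are not the engine's own helpers:

```c
#include <stdint.h>

/* Hypothetical stand-in for a record position stored on a key page. */
typedef uint64_t sketch_recpos;

/* Shift the record number up one bit; the freed low bit is assumed to
   carry the "transid follows" flag. The top bit of recno is lost. */
static sketch_recpos sketch_pack_recno(sketch_recpos recno, int transid_follows)
{
  return (recno << 1) | (transid_follows ? 1u : 0u);
}

/* Recover the original record number by dropping the flag bit. */
static sketch_recpos sketch_unpack_recno(sketch_recpos stored)
{
  return stored >> 1;
}

/* Read back the flag bit. */
static int sketch_transid_follows(sketch_recpos stored)
{
  return (int) (stored & 1u);
}
```

The cost of this layout is one bit of addressable record numbers, in exchange for not needing a separate flag byte per key.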
  share= info->s;
Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type Added --zerofill (-z) option to maria_chk (Mostly code from Jani) Added now table states ZEROFILLED and MOVEABLE Set state.changed with new states when things changes include/maria.h: Added maria_zerofill include/myisamchk.h: Added option for zerofill Extend testflag to be 64 to allow for more flags mysql-test/r/create.result: Updated results after merge mysql-test/r/maria.result: Updated results after merge mysys/my_getopt.c: Removed not used variable sql/sql_show.cc: Fixed wrong page type storage/maria/ma_blockrec.c: Renamed compact_page() to ma_compact_block_page() and made it global Always zerofill half filled blob pages Set share.state.changed on REDO storage/maria/ma_blockrec.h: Added _ma_compact_block_page() storage/maria/ma_check.c: Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type This gives the following benefits: - Table is smaller if compressed - All LSN are removed for transactinal tables and this makes them movable between systems Dont set table states of we are using --quick Changed log entry for repair to use 8 bytes for flag storage/maria/ma_delete.c: Simplify code Update state.changed storage/maria/ma_key_recover.c: Update state.changed storage/maria/ma_locking.c: Set uuid for file on first change if it's not set (table was cleared with zerofill) storage/maria/ma_loghandler.c: Updated log entry for REDO_REPAIR_TABLE storage/maria/ma_recovery.c: Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes) Set new bits in state.changed storage/maria/ma_test_all.sh: Nicer output storage/maria/ma_test_recovery.expected: Updated results (now states flags are visible) storage/maria/ma_update.c: Update state.changed storage/maria/ma_write.c: Simplify code Set state.changed 
storage/maria/maria_chk.c: Added option --zerofill Added printing of states for MOVABLE and ZEROFILLED MYD -> MAD MYI -> MAI storage/maria/maria_def.h: Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED Added prototype for new functions storage/maria/unittest/ma_test_all-t: More tests, including tests for zerofill Removed some not needed 'print' statements storage/maria/unittest/ma_test_loghandler_multithread-t.c: Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
  share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
                          STATE_NOT_ZEROFILLED | STATE_NOT_MOVABLE);
  enlarge_buffer(rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec->lsn, 0, rec->record_length,
                           log_record_buffer.str, NULL) !=
      rec->record_length)
  {
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
    eprint(tracef, "Failed to read record");
    return 1;
  }
  info->trn= trn;
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
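Note on the commit message above: it explains that each CLR_END stores the LSN of the previous UNDO, that this LSN is 0 for a transaction's first UNDO, and that undo_lsn==0 therefore can no longer mean "this is not UNDO work", so LSN_ERROR is used for that instead. A minimal sketch of walking such a chain, under stated assumptions: the sentinel values, the sketch_undo struct, and every function name below are hypothetical, and the log is modelled as a plain array rather than the real translog.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;

/* Assumed sentinel values, for illustration only. */
#define SKETCH_LSN_NONE  ((sketch_lsn) 0)  /* no previous UNDO in the chain */
#define SKETCH_LSN_ERROR ((sketch_lsn) 1)  /* "this is not UNDO work" */

typedef struct
{
  sketch_lsn lsn;            /* where this UNDO record lives */
  sketch_lsn prev_undo_lsn;  /* LSN of the transaction's previous UNDO */
} sketch_undo;

/* 0 is a legal "previous UNDO" value, so only the error sentinel can
   mark a call as ordinary (non-UNDO) work. */
static int sketch_is_undo_work(sketch_lsn undo_lsn)
{
  return undo_lsn != SKETCH_LSN_ERROR;
}

static const sketch_undo *sketch_find(const sketch_undo *log, size_t n,
                                      sketch_lsn lsn)
{
  size_t i;
  for (i= 0; i < n; i++)
    if (log[i].lsn == lsn)
      return &log[i];
  return NULL;
}

/* Walk the chain backwards from *undo_lsn. After each executed UNDO the
   transaction's undo_lsn becomes the previous UNDO's LSN, which is the
   value a CLR_END would carry. Returns the number of UNDOs executed,
   or -1 if the chain is broken. */
static int sketch_rollback(const sketch_undo *log, size_t n,
                           sketch_lsn *undo_lsn)
{
  int undone= 0;
  while (*undo_lsn != SKETCH_LSN_NONE)
  {
    const sketch_undo *u= sketch_find(log, n, *undo_lsn);
    if (u == NULL || u->prev_undo_lsn >= u->lsn)
      return -1;               /* missing record or non-decreasing chain */
    /* ... the UNDO itself would be executed here ... */
    *undo_lsn= u->prev_undo_lsn;
    undone++;
  }
  return undone;
}
```

Updating undo_lsn only after the record has been executed mirrors the ordering described in the ma_recovery.c part of the commit message.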
  error= _ma_apply_undo_row_update(info, previous_undo_lsn,
                                   log_record_buffer.str + LSN_STORE_SIZE +
Added applying of undo for updates Fixed bug in duplicate key handling for block records during repair All read-row methods now return error number in case of error Don't calculate checksum for null fields Fixed bug when running maria_read_log with -o BUILD/SETUP.sh: Added STACK_DIRECTION BUILD/compile-pentium-debug-max: Moved STACK_DIRECTION to SETUP include/myisam.h: Added extra parameter to write_key storage/maria/ma_blockrec.c: Added applying of undo for updates Fixed indentation Removed some not needed casts Fixed wrong logging of CLR record Split ma_update_block_record to two functions to be able to reuse it from undo-applying Simplify filling of packed fields ma_record_block_record) now returns error number on failure Sligtly changed log record information for undo-update storage/maria/ma_check.c: Fixed bug in duplicate key handling for block records during repair storage/maria/ma_checksum.c: Don't calculate checksum for null fields storage/maria/ma_dynrec.c: _ma_read_dynamic_reocrd() now returns error number on error Rest of the changes are code simplification and indentation fixes storage/maria/ma_locking.c: Added comment storage/maria/ma_loghandler.c: More debugging Removed printing of total_record_length as this was always same as record_length storage/maria/ma_open.c: Allocate bitmap for changed fields storage/maria/ma_packrec.c: read_record now returns error number on error storage/maria/ma_recovery.c: Fixed wrong arguments to undo_row_update storage/maria/ma_statrec.c: read_record now returns error number on error (not 1) Code simplification storage/maria/ma_test1.c: Added exit possibility after update phase (to test undo of updates) storage/maria/maria_def.h: Include bitmap header file storage/maria/maria_read_log.c: Fixed bug when running with -o
2007-09-09 18:15:10 +02:00
                                   FILEID_STORE_SIZE,
                                   rec->record_length -
                                   (LSN_STORE_SIZE + FILEID_STORE_SIZE));
  info->trn= 0;
  tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
         LSN_IN_PARTS(trn->undo_lsn));
  return error;
}
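
Note on the function that ends above: it reads the whole log record into a buffer, then hands _ma_apply_undo_row_update() a pointer past the LSN and file id header together with the remaining length. The offset arithmetic can be sketched on its own as follows. This is an illustration under assumptions, not the engine's code: the two sizes and all sketch_* names are hypothetical, and the added length check is a defensive extra that the original does not perform at this point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header sizes, for illustration only. */
enum { SKETCH_LSN_STORE_SIZE= 7, SKETCH_FILEID_STORE_SIZE= 2 };

typedef struct
{
  const uint8_t *data;  /* first byte after the LSN + file id header */
  size_t length;        /* bytes remaining after the header */
} sketch_payload;

/* Split a fully read UNDO record into header and payload.
   Returns 0 on success, 1 if the record is shorter than its header. */
static int sketch_undo_payload(const uint8_t *record, size_t record_length,
                               sketch_payload *out)
{
  const size_t header= SKETCH_LSN_STORE_SIZE + SKETCH_FILEID_STORE_SIZE;
  assert(record != NULL && out != NULL);
  if (record_length < header)
    return 1;
  out->data= record + header;
  out->length= record_length - header;
  return 0;
}
```

Keeping the pointer offset and the length subtraction in one place avoids the two expressions drifting apart when the header layout changes.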
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
prototype_undo_exec_hook(UNDO_KEY_INSERT)
{
  my_bool error;
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
  LSN previous_undo_lsn= lsn_korr(rec->header);
  MARIA_SHARE *share;
  if (info == NULL || maria_is_crashed(info))
  {
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
    skip_undo_record(previous_undo_lsn, trn);
    return 0;
}
share= info->s;
Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type Added --zerofill (-z) option to maria_chk (Mostly code from Jani) Added new table states ZEROFILLED and MOVABLE Set state.changed with new states when things change include/maria.h: Added maria_zerofill include/myisamchk.h: Added option for zerofill Extend testflag to be 64 to allow for more flags mysql-test/r/create.result: Updated results after merge mysql-test/r/maria.result: Updated results after merge mysys/my_getopt.c: Removed not used variable sql/sql_show.cc: Fixed wrong page type storage/maria/ma_blockrec.c: Renamed compact_page() to ma_compact_block_page() and made it global Always zerofill half filled blob pages Set share.state.changed on REDO storage/maria/ma_blockrec.h: Added _ma_compact_block_page() storage/maria/ma_check.c: Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type This gives the following benefits: - Table is smaller if compressed - All LSNs are removed for transactional tables and this makes them movable between systems Don't set table states if we are using --quick Changed log entry for repair to use 8 bytes for flag storage/maria/ma_delete.c: Simplify code Update state.changed storage/maria/ma_key_recover.c: Update state.changed storage/maria/ma_locking.c: Set uuid for file on first change if it's not set (table was cleared with zerofill) storage/maria/ma_loghandler.c: Updated log entry for REDO_REPAIR_TABLE storage/maria/ma_recovery.c: Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes) Set new bits in state.changed storage/maria/ma_test_all.sh: Nicer output storage/maria/ma_test_recovery.expected: Updated results (now states flags are visible) storage/maria/ma_update.c: Update state.changed storage/maria/ma_write.c: Simplify code Set state.changed 
storage/maria/maria_chk.c: Added option --zerofill Added printing of states for MOVABLE and ZEROFILLED MYD -> MAD MYI -> MAI storage/maria/maria_def.h: Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED Added prototype for new functions storage/maria/unittest/ma_test_all-t: More tests, including tests for zerofill Removed some not needed 'print' statements storage/maria/unittest/ma_test_loghandler_multithread-t.c: Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
STATE_NOT_ZEROFILLED | STATE_NOT_MOVABLE);
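As an illustration of the flag update above, here is a minimal, self-contained sketch of how a bitmask such as `state.changed` accumulates flags and how a later zerofill-like pass might clear two of them. The `DEMO_` flag values, the `demo_state` type and both helper functions are hypothetical; the real definitions live in maria_def.h and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_STATE_CHANGED        (1U << 0)
#define DEMO_STATE_NOT_ANALYZED   (1U << 1)
#define DEMO_STATE_NOT_ZEROFILLED (1U << 2)
#define DEMO_STATE_NOT_MOVABLE    (1U << 3)

typedef struct { uint32_t changed; } demo_state;

/* Mark the table as modified. OR-ing is idempotent, so applying the
   same REDO entry twice leaves the mask unchanged. */
static void demo_mark_changed(demo_state *st)
{
  assert(st != NULL);
  st->changed|= (DEMO_STATE_CHANGED | DEMO_STATE_NOT_ANALYZED |
                 DEMO_STATE_NOT_ZEROFILLED | DEMO_STATE_NOT_MOVABLE);
}

/* What a zerofill-like pass might do afterwards (assumption):
   clear only the two flags that zerofilling makes obsolete. */
static void demo_mark_zerofilled(demo_state *st)
{
  assert(st != NULL);
  st->changed&= ~(DEMO_STATE_NOT_ZEROFILLED | DEMO_STATE_NOT_MOVABLE);
}
```

Calling `demo_mark_changed()` repeatedly is harmless, which is the property a REDO pass relies on when it may replay an entry more than once.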
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocating tail pages - Ensure we reserve place for directory entry when calculating place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clears freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these were written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdb extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution (This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculating place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that delete of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables) storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automatically from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to read record");
return 1;
}
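The grow-then-verify pattern above (enlarge the buffer, then treat either a failed allocation or a short read as an error) can be sketched in isolation. This is a minimal illustration under stated assumptions: `demo_buffer`, `demo_enlarge`, `demo_read` and `demo_load_record` are hypothetical stand-ins and not the translog API.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct { unsigned char *str; size_t length; } demo_buffer;

/* Grow buf so it can hold at least `needed` bytes.
   On allocation failure the buffer is freed and str is set to NULL. */
static void demo_enlarge(demo_buffer *buf, size_t needed)
{
  unsigned char *p;
  if (buf->length >= needed)
    return;
  p= realloc(buf->str, needed);
  if (p == NULL)
  {
    free(buf->str);
    buf->str= NULL;
    buf->length= 0;
    return;
  }
  buf->str= p;
  buf->length= needed;
}

/* Hypothetical reader: copies up to `want` bytes from src into dst
   and returns the number of bytes actually copied. */
static size_t demo_read(const unsigned char *src, size_t src_len,
                        size_t want, unsigned char *dst)
{
  size_t n= want < src_len ? want : src_len;
  memcpy(dst, src, n);
  return n;
}

/* Returns 0 on success, 1 if allocation failed or the read was short. */
static int demo_load_record(demo_buffer *buf, const unsigned char *src,
                            size_t src_len, size_t record_length)
{
  demo_enlarge(buf, record_length);
  if (buf->str == NULL ||
      demo_read(src, src_len, record_length, buf->str) != record_length)
    return 1;
  return 0;
}
```

Checking the pointer before the read keeps the short-circuit `||` from ever passing a NULL destination to the reader, which mirrors the ordering of the two conditions in the original `if`.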
info->trn= trn;
error= _ma_apply_undo_key_insert(info, previous_undo_lsn,
log_record_buffer.str + LSN_STORE_SIZE +
FILEID_STORE_SIZE,
rec->record_length - LSN_STORE_SIZE -
FILEID_STORE_SIZE);
info->trn= 0;
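The pointer arithmetic in the call above skips a fixed-size header (a stored LSN followed by a file id) to reach the UNDO payload. A minimal sketch of that slicing follows; the header sizes of 7 and 2 bytes are assumptions for illustration, and the `DEMO_` constants, `demo_slice` and `demo_payload` are hypothetical names rather than anything from the Maria headers.

```c
#include <stddef.h>

/* Assumed sizes, for illustration only. */
#define DEMO_LSN_STORE_SIZE    7
#define DEMO_FILEID_STORE_SIZE 2
#define DEMO_HEADER_SIZE (DEMO_LSN_STORE_SIZE + DEMO_FILEID_STORE_SIZE)

typedef struct { const unsigned char *data; size_t length; } demo_slice;

/* Return the payload that follows the header, or {NULL, 0} when the
   record is too short to contain a full header. */
static demo_slice demo_payload(const unsigned char *record,
                               size_t record_length)
{
  demo_slice out= { NULL, 0 };
  if (record == NULL || record_length < DEMO_HEADER_SIZE)
    return out;
  out.data= record + DEMO_HEADER_SIZE;
  out.length= record_length - DEMO_HEADER_SIZE;
  return out;
}
```

The explicit length check is a defensive addition in this sketch; with unsigned lengths, subtracting the header size from a too-short record would otherwise wrap around to a very large value.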
/* trn->undo_lsn is updated in an inwrite_hook when writing the CLR_END */
tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
LSN_IN_PARTS(trn->undo_lsn));
First part of redo/undo for key pages
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows
Checksum for MyISAM now ignores NULL and not used part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable

include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
  Added new error message to be used when handler initialization failed
include/my_global.h:
  Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
  Added const to some parameters
mysys/array.c:
  More DBUG
mysys/my_error.c:
  Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h:
  Renamed variables to avoid variable shadowing
sql/handler.h:
  Renamed parameter to avoid variable name conflict
sql/item.h:
  Renamed variables to avoid variable shadowing
sql/log_event_old.h:
  Renamed variables to avoid variable shadowing
sql/set_var.h:
  Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc:
  Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
  This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am:
  Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
  Removed extra empty line
storage/maria/ma_checkpoint.c:
  Remove not needed trnman.h
storage/maria/ma_close.c:
  Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
  Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c:
  New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (we need now also for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some not needed arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some not needed initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
  Removed end space
storage/maria/ma_check.c:
  Removed end space
storage/maria/ma_create.c:
  Removed end space
storage/maria/ma_locking.c:
  Removed end space
storage/maria/ma_packrec.c:
  Removed end space
storage/maria/ma_pagecache.c:
  Removed end space
storage/maria/ma_panic.c:
  Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c:
  Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  use MARIA_RECORD_POS of record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c:
  Removed end space
storage/maria/ma_statrec.c:
  Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Remove not used code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
  Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open()
  Zerofill new used pages for:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c:
  Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
  Updated arguments for changed function calls
storage/myisam/mi_check.c:
  Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
  Update checksums to ignore null columns
storage/myisam/mi_create.c:
  Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
  Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
  Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
  New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
  New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
  New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  return error;
}
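
Note: the hook that follows applies the skip rule described in the 2008-01-17 commit message: a missing or crashed table does not make recovery fail; the UNDO is skipped and recovery moves on to the next record. A minimal, self-contained sketch of that control flow is given here. Every name in it (`table_t`, `trn_t`, `skip_undo`, `exec_undo`, `apply_ok`) is hypothetical and only stands in for the Maria types; it does not reproduce `skip_undo_record()` or the real hook, whose bodies are not shown in this listing. Updating `undo_lsn` after a successful apply is likewise an assumption made to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not the Maria types. */
typedef uint64_t lsn_t;

typedef struct { int crashed; } table_t;

typedef struct {
  lsn_t    undo_lsn;       /* next UNDO record this transaction must process */
  unsigned skipped_undos;  /* how many UNDOs were skipped, so a caller can warn */
} trn_t;

/* Skipping an UNDO: step the transaction back to the previous UNDO in its
   chain and count the skip. */
static void skip_undo(trn_t *trn, lsn_t previous_undo_lsn)
{
  trn->undo_lsn = previous_undo_lsn;
  trn->skipped_undos++;
}

/* Trivial "apply" callback used for the demonstration: always succeeds. */
static int apply_ok(table_t *tbl)
{
  (void) tbl;
  return 0;
}

/* Returns 0 when the record was applied or skipped, and nonzero only when
   applying it to a usable table failed. */
static int exec_undo(table_t *tbl, trn_t *trn, lsn_t previous_undo_lsn,
                     int (*apply)(table_t *))
{
  int error;

  if (tbl == NULL || tbl->crashed)
  {
    /* Table is missing or marked crashed: do not fail, just move on. */
    skip_undo(trn, previous_undo_lsn);
    return 0;
  }
  error = apply(tbl);
  if (!error)
    trn->undo_lsn = previous_undo_lsn;  /* assumption: advance on success */
  return error;
}
```

The point of the sketch is that a skipped record and an applied record leave the transaction in the same position in its UNDO chain; only the skip counter tells them apart, which is what allows a warning to be issued afterwards.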
prototype_undo_exec_hook(UNDO_KEY_DELETE)
{
  my_bool error;
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
  LSN previous_undo_lsn= lsn_korr(rec->header);
  MARIA_SHARE *share;
  if (info == NULL || maria_is_crashed(info))
  {
    skip_undo_record(previous_undo_lsn, trn);
    return 0;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (now also needed for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
}
share= info->s;
Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type Added --zerofill (-z) option to maria_chk (Mostly code from Jani) Added new table states ZEROFILLED and MOVEABLE Set state.changed with new states when things change include/maria.h: Added maria_zerofill include/myisamchk.h: Added option for zerofill Extend testflag to be 64 to allow for more flags mysql-test/r/create.result: Updated results after merge mysql-test/r/maria.result: Updated results after merge mysys/my_getopt.c: Removed not used variable sql/sql_show.cc: Fixed wrong page type storage/maria/ma_blockrec.c: Renamed compact_page() to ma_compact_block_page() and made it global Always zerofill half filled blob pages Set share.state.changed on REDO storage/maria/ma_blockrec.h: Added _ma_compact_block_page() storage/maria/ma_check.c: Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type This gives the following benefits: - Table is smaller if compressed - All LSN are removed for transactional tables and this makes them movable between systems Don't set table states if we are using --quick Changed log entry for repair to use 8 bytes for flag storage/maria/ma_delete.c: Simplify code Update state.changed storage/maria/ma_key_recover.c: Update state.changed storage/maria/ma_locking.c: Set uuid for file on first change if it's not set (table was cleared with zerofill) storage/maria/ma_loghandler.c: Updated log entry for REDO_REPAIR_TABLE storage/maria/ma_recovery.c: Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes) Set new bits in state.changed storage/maria/ma_test_all.sh: Nicer output storage/maria/ma_test_recovery.expected: Updated results (now states flags are visible) storage/maria/ma_update.c: Update state.changed storage/maria/ma_write.c: Simplify code Set state.changed 
storage/maria/maria_chk.c: Added option --zerofill Added printing of states for MOVABLE and ZEROFILLED MYD -> MAD MYI -> MAI storage/maria/maria_def.h: Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED Added prototype for new functions storage/maria/unittest/ma_test_all-t: More tests, including tests for zerofill Removed some not needed 'print' statements storage/maria/unittest/ma_test_loghandler_multithread-t.c: Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
                        STATE_NOT_ZEROFILLED | STATE_NOT_MOVABLE);
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
    translog_read_record(rec->lsn, 0, rec->record_length,
                         log_record_buffer.str, NULL) !=
    rec->record_length)
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocating tail pages - Ensure we reserve place for directory entry when calculating place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clears freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these were written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdb extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution (This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculating place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that delete of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables) storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automatically from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to read record");
return 1;
}
info->trn= trn;
error= _ma_apply_undo_key_delete(info, previous_undo_lsn,
log_record_buffer.str + LSN_STORE_SIZE +
FILEID_STORE_SIZE,
rec->record_length - LSN_STORE_SIZE -
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
FILEID_STORE_SIZE, FALSE);
info->trn= 0;
/* trn->undo_lsn is updated in an inwrite_hook when writing the CLR_END */
tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
LSN_IN_PARTS(trn->undo_lsn));
  return error;
}

prototype_undo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT)
{
  my_bool error;
  MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
  LSN previous_undo_lsn= lsn_korr(rec->header);
  MARIA_SHARE *share;
  if (info == NULL || maria_is_crashed(info))
  {
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
    skip_undo_record(previous_undo_lsn, trn);
    return 0;
  }
  share= info->s;
Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type Added --zerofill (-z) option to maria_chk (Mostly code from Jani) Added now table states ZEROFILLED and MOVEABLE Set state.changed with new states when things changes include/maria.h: Added maria_zerofill include/myisamchk.h: Added option for zerofill Extend testflag to be 64 to allow for more flags mysql-test/r/create.result: Updated results after merge mysql-test/r/maria.result: Updated results after merge mysys/my_getopt.c: Removed not used variable sql/sql_show.cc: Fixed wrong page type storage/maria/ma_blockrec.c: Renamed compact_page() to ma_compact_block_page() and made it global Always zerofill half filled blob pages Set share.state.changed on REDO storage/maria/ma_blockrec.h: Added _ma_compact_block_page() storage/maria/ma_check.c: Added maria_zerofill() This is used to bzero all not used parts of the index pages and compact and bzero the not used parts of the data pages of block-record type This gives the following benefits: - Table is smaller if compressed - All LSN are removed for transactinal tables and this makes them movable between systems Dont set table states of we are using --quick Changed log entry for repair to use 8 bytes for flag storage/maria/ma_delete.c: Simplify code Update state.changed storage/maria/ma_key_recover.c: Update state.changed storage/maria/ma_locking.c: Set uuid for file on first change if it's not set (table was cleared with zerofill) storage/maria/ma_loghandler.c: Updated log entry for REDO_REPAIR_TABLE storage/maria/ma_recovery.c: Updated log entry for REDO_REPAIR_TABLE (flag is now 8 bytes) Set new bits in state.changed storage/maria/ma_test_all.sh: Nicer output storage/maria/ma_test_recovery.expected: Updated results (now states flags are visible) storage/maria/ma_update.c: Update state.changed storage/maria/ma_write.c: Simplify code Set state.changed 
storage/maria/maria_chk.c: Added option --zerofill Added printing of states for MOVABLE and ZEROFILLED MYD -> MAD MYI -> MAI storage/maria/maria_def.h: Added states STATE_NOT_MOVABLE and STATE_NOT_ZEROFILLED Added prototype for new functions storage/maria/unittest/ma_test_all-t: More tests, including tests for zerofill Removed some not needed 'print' statements storage/maria/unittest/ma_test_loghandler_multithread-t.c: Smaller buffer to not trash development machines totally
2007-12-31 10:55:46 +01:00
  share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
                          STATE_NOT_ZEROFILLED | STATE_NOT_MOVABLE);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
enlarge_buffer(rec);
if (log_record_buffer.str == NULL ||
translog_read_record(rec->lsn, 0, rec->record_length,
log_record_buffer.str, NULL) !=
rec->record_length)
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to read record");
return 1;
}
info->trn= trn;
error= _ma_apply_undo_key_delete(info, previous_undo_lsn,
log_record_buffer.str + LSN_STORE_SIZE +
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
FILEID_STORE_SIZE,
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
rec->record_length - LSN_STORE_SIZE -
WL#3072 - Maria Recovery
Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash).
Repair: mark table crashed-on-repair at start and bump skip_redo_lsn at start; in case of crash this is easier for recovery (tells it to skip old REDOs or even the UNDO phase) and for the user (tells them to repair). Sync files at the end.
Recovery skips a missing or corrupted table and moves to the next record (in REDO or UNDO phase) to be more robust; it warns if this happens in the UNDO phase.
Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes().
Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this.

include/myisamchk.h:
  new flag: bulk insert repair mustn't bump create_rename_lsn
mysql-test/lib/mtr_report.pl:
  skip normal warning in maria-recovery.test
mysql-test/r/maria-recovery.result:
  result: crash before bulk insert is committed causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing.
mysql-test/t/maria-recovery.test:
  - can't check the table or it would commit the transaction, but check is made after recovery.
  - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index).
storage/maria/CMakeLists.txt:
  new file
storage/maria/Makefile.am:
  new file
storage/maria/ha_maria.cc:
  - If bulk insert on a transactional table uses an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert().
  - write log record for repair operation only after it's fully done, including index sort (maria_repair*() used to write the log record).
  - Adding back file->trn=NULL which was removed by mistake earlier.
storage/maria/ha_maria.h:
  new member (see ha_maria.cc)
storage/maria/ma_bitmap.c:
  Functions to create missing bitmaps:
  - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon.
  - one function which the one above calls, and creates bitmaps in page cache
  - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above.
storage/maria/ma_blockrec.c:
  - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex.
  - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list
  - execution of UNDO_BULK_INSERT_WITH_REPAIR
storage/maria/ma_blockrec.h:
  new functions
storage/maria/ma_check.c:
  - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added
  - maria_repair() is allowed to re-enable logging only if it is the one which disabled it.
  - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record)
  - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair())
  - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end.
  - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair().
storage/maria/ma_create.c:
  Update skip_redo_lsn, create_rename_lsn optionally.
storage/maria/ma_delete_all.c:
  Need to sync files at end of maria_delete_all_rows(), if transactional.
storage/maria/ma_extra.c:
  During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE.
storage/maria/ma_key_recover.c:
  - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices.
  - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure
storage/maria/ma_key_recover.h:
  new prototype
storage/maria/ma_locking.c:
  DBUG_VOID_RETURN missing
storage/maria/ma_loghandler.c:
  UNDO for bulk insert with repair, and REDO for creating bitmaps.
  LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type.
storage/maria/ma_loghandler.h:
  new UNDO and REDO
storage/maria/ma_open.c:
  Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value; this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open().
  Store skip_redo_lsn in index' header.
  maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes.
storage/maria/ma_recovery.c:
  - Skip a corrupted, missing, or repaired-with-maria_chk table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings.
  - Skip REDO|UNDO in REDO phase if <skip_redo_lsn.
  - If UNDO phase fails, delete transactions to not make trnman assert.
  - Update skip_redo_lsn when playing REDO_CREATE_TABLE
  - Don't record UNDOs for old transactions which we don't know (long_trid==0)
  - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c)
  - Execution of UNDO_BULK_INSERT_WITH_REPAIR
  - Don't try to find a page number in REDO_DELETE_ALL
  - Pieces moved to ma_recovery_util.c
storage/maria/ma_rename.c:
  name change
storage/maria/ma_static.c:
  I modified layout of the index' header (inserted skip_redo_lsn in its middle)
storage/maria/ma_test2.c:
  allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT
storage/maria/ma_test_recovery.expected:
  6 as testflag instead of 4
storage/maria/ma_test_recovery:
  Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug.
storage/maria/maria_chk.c:
  skip_redo_lsn should be updated too, for consistency.
  Write a REDO_REPAIR after all operations (including sort-records) have been done.
  No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end.
  write_log_record() is a function, to not clutter maria_chk().
storage/maria/maria_def.h:
  New member skip_redo_lsn in the state, and comments
storage/maria/maria_pack.c:
  skip_redo_lsn should be updated too, for consistency
storage/maria/ma_recovery_util.c:
  _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c).
storage/maria/ma_recovery_util.h:
  new file
2008-01-17 23:59:32 +01:00
FILEID_STORE_SIZE, TRUE);
info->trn= 0;
/* trn->undo_lsn is updated in an inwrite_hook when writing the CLR_END */
tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
LSN_IN_PARTS(trn->undo_lsn));
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
return error;
}
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle) - fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc) - when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix) - UNDO_BULK_INSERT_WITH_REPAIR is also used with repair, changes name mysql-test/r/maria.result: tests for bugs fixed mysql-test/t/maria.test: tests for bugs fixed sql/sql_insert.cc: Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging. storage/maria/ha_maria.cc: Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()). storage/maria/ha_maria.h: variable now can have 3 states not 2 storage/maria/ma_bitmap.c: name change storage/maria/ma_blockrec.c: name change storage/maria/ma_blockrec.h: name change storage/maria/ma_check.c: * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before. * the log record of bulk insert can be used even without repair * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush). storage/maria/ma_key_recover.c: name change storage/maria/ma_loghandler.c: name change storage/maria/ma_loghandler.h: name change storage/maria/ma_pagecache.c: A function, to check in debug builds that no dirty pages exist for a file. 
storage/maria/ma_pagecache.h: new function (nothing in non-debug) storage/maria/ma_recovery.c: _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist). storage/maria/maria_chk.c: comment storage/maria/maria_def.h: new names mysql-test/suite/rpl/r/rpl_stm_maria.result: result (it used to crash) mysql-test/suite/rpl/t/rpl_stm_maria.test: Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
prototype_undo_exec_hook(UNDO_BULK_INSERT)
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, including the index sort (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
{
my_bool error;
MARIA_HA *info= get_MARIA_HA_from_UNDO_record(rec);
LSN previous_undo_lsn= lsn_korr(rec->header);
MARIA_SHARE *share;
/* Here we don't check for crashed as we can undo the bulk insert */
if (info == NULL)
{
skip_undo_record(previous_undo_lsn, trn);
return 0;
}
share= info->s;
share->state.changed|= (STATE_CHANGED | STATE_NOT_ANALYZED |
STATE_NOT_ZEROFILLED | STATE_NOT_MOVABLE);
info->trn= trn;
error= _ma_apply_undo_bulk_insert(info, previous_undo_lsn);
info->trn= 0;
/* trn->undo_lsn is updated in an inwrite_hook when writing the CLR_END */
tprint(tracef, " undo_lsn now LSN (%lu,0x%lx)\n",
LSN_IN_PARTS(trn->undo_lsn));
return error;
}
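The control flow of the UNDO_BULK_INSERT hook above can be sketched in isolation. This is only an illustrative model, not the engine's code: `lsn_t`, `mock_trn`, `mock_table`, `mock_skip_undo()` and `mock_undo_bulk_insert()` are hypothetical stand-ins, and the sketch assumes that skipping an UNDO simply moves the transaction's undo chain to the previous UNDO record, and that undoing a bulk insert empties the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real LSN, TRN and MARIA_HA types are richer. */
typedef uint64_t lsn_t;
typedef struct { lsn_t undo_lsn; unsigned skipped_undos; } mock_trn;
typedef struct { int rows; } mock_table;

/* Assumption: skipping an UNDO just points the chain at its predecessor. */
void mock_skip_undo(lsn_t previous_undo_lsn, mock_trn *trn)
{
  assert(trn != NULL);
  trn->undo_lsn= previous_undo_lsn;
  trn->skipped_undos++;
}

/*
  Returns 0 on success (a skipped record also counts as success, so that
  recovery can move on to the next record), non-zero on error.
*/
int mock_undo_bulk_insert(mock_table *table, lsn_t previous_undo_lsn,
                          mock_trn *trn)
{
  if (table == NULL)              /* table missing or unusable: skip */
  {
    mock_skip_undo(previous_undo_lsn, trn);
    return 0;
  }
  table->rows= 0;                 /* undoing a bulk insert empties the table */
  /* In the real code this update happens in a hook when CLR_END is written */
  trn->undo_lsn= previous_undo_lsn;
  return 0;
}
```

In both branches the transaction's `undo_lsn` ends up at the previous UNDO, which is what lets the UNDO phase keep walking the chain whether or not the table could be opened.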
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get idenitical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Aded mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
static int run_redo_phase(LSN lsn, LSN lsn_end, enum maria_apply_log_way apply)
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)

mysys/my_realloc.c:
  noted an issue in my_realloc()
sql/mysqld.cc:
  page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc:
  update to new prototype
storage/maria/ma_bitmap.c:
  when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c:
  Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c:
  empty data file has a bitmap page
storage/maria/ma_control_file.c:
  useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h:
  new prototype
storage/maria/ma_create.c:
  Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c:
  as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this.
  A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h:
  new function
storage/maria/ma_loghandler_lsn.h:
  maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
  * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
  * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
  * comments.
storage/maria/ma_recovery.c:
  * preparations for the UNDO phase: recreate TRNs
  * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
  * reworking all around (less duplication)
storage/maria/ma_recovery.h:
  a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c:
  tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
  * update to new prototype
  * no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
  * a function for Recovery to create a transaction (TRN), needed in the UNDO phase
  * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h:
  new functions
2007-08-29 16:43:01 +02:00
{
Remove SAFE_MODE for opt_range as it prevents UPDATE from using keys
REDO optimization (basically avoid moving blocks from/to pagecache)
More command line arguments to maria_read_log
Fixed recovery bug when recreating table

sql/opt_range.cc:
  Remove SAFE_MODE for opt_range as it prevents UPDATE from using keys
storage/maria/ma_blockrec.c:
  REDO optimization
  Use new interface for pagecache_reads to avoid copying page buffers
storage/maria/ma_loghandler.c:
  Patch from Sanja:
  - Added new parameter to translog_get_page to use direct links to pagecache
  - Changed scanner to be able to use direct links
  This avoids a lot of calls to bmove512() in page cache.
storage/maria/ma_loghandler.h:
  Added direct link to pagecache objects
storage/maria/ma_open.c:
  Added const to parameter
  Added missing braces
storage/maria/ma_pagecache.c:
  From Sanja:
  - Added direct links to pagecache (from pagecache_read())
    Direct link means that on pagecache_read we get back a pointer to the pagecache buffer
  From Monty:
  - Fixed arguments to init_page_cache to handle big page caches
  - Fixed compiler warnings
  - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors
storage/maria/ma_pagecache.h:
  Changed block numbers from int to long to be able to handle big page caches
  Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK
storage/maria/ma_recovery.c:
  Fixed recovery bug when recreating table (table was kept open)
  Moved some variables to function start (portability)
  Added space to some print messages
storage/maria/maria_chk.c:
  key_buffer_size -> page_buffer_size
storage/maria/maria_def.h:
  Changed default page_buffer_size to 10M
storage/maria/maria_read_log.c:
  Added more startup options:
  --version
  --undo (apply undo)
  --page_cache_size (to run with big cache sizes)
  --silent (to not get any output from --apply)
storage/maria/unittest/ma_control_file-t.c:
  Fixed compiler warning
storage/maria/unittest/ma_test_loghandler-t.c:
  Added new argument to translog_init_scanner()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
  Added new argument to translog_init_scanner()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
  Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
TRANSLOG_HEADER_BUFFER rec;
struct st_translog_scanner_data scanner;
int len;
uint i;
DBUG_ENTER("run_redo_phase");
/* install hooks for execution */
#define install_redo_exec_hook(R) \
log_record_type_descriptor[LOGREC_ ## R].record_execute_in_redo_phase= \
exec_REDO_LOGREC_ ## R;
#define install_redo_exec_hook_shared(R,S) \
log_record_type_descriptor[LOGREC_ ## R].record_execute_in_redo_phase= \
exec_REDO_LOGREC_ ## S;
#define install_undo_exec_hook(R) \
log_record_type_descriptor[LOGREC_ ## R].record_execute_in_undo_phase= \
exec_UNDO_LOGREC_ ## R;
install_redo_exec_hook(LONG_TRANSACTION_ID);
install_redo_exec_hook(CHECKPOINT);
install_redo_exec_hook(REDO_CREATE_TABLE);
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
install_redo_exec_hook(REDO_RENAME_TABLE);
install_redo_exec_hook(REDO_REPAIR_TABLE);
install_redo_exec_hook(REDO_DROP_TABLE);
install_redo_exec_hook(FILE_ID);
WL#3072 - Maria recovery
maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed.

storage/maria/ha_maria.cc:
  as we now log a record when disabling transactionality, we need the TRN to be set up first
storage/maria/ma_blockrec.c:
  comment
storage/maria/ma_loghandler.c:
  new type of log record
storage/maria/ma_loghandler.h:
  new type of log record
storage/maria/ma_recovery.c:
  * maria_apply_log() now returns a count of warnings. What currently produces warnings is:
    - skipping applying UNDOs though there are some (=> inconsistent table)
    - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete).
    Count of warnings affects the final message of maria_read_log and recovery (though in recovery neither of the two conditions above should happen).
  * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h:
  maria_apply_log() returns a count of warnings
storage/maria/maria_def.h:
  _ma_tmp_disable_logging_for_table() grows so becomes a function
storage/maria/maria_read_log.c:
  maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
install_redo_exec_hook(INCOMPLETE_LOG);
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as there is a LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
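The boundary idea described in the commit message above can be illustrated with a small model. This is only a sketch under simplifying assumptions: the record kinds, the `count_applied_redos()` helper and the "buffer REDOs until a closing record" rule are hypothetical stand-ins for illustration, not the real translog types or the actual code of ma_recovery.c.

```c
#include <stddef.h>

/* Hypothetical, simplified record kinds; not the real translog types. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Scan one transaction's records in log order and return how many REDOs
  would be applied. REDOs are buffered until a group-closing record
  (UNDO or CLR_END) arrives; a boundary record throws the buffered,
  incomplete group away so that it can never be paired with a later
  closing record.
*/
size_t count_applied_redos(const enum rec_kind *recs, size_t n)
{
  size_t pending= 0, applied= 0;
  for (size_t i= 0; i < n; i++)
  {
    switch (recs[i])
    {
    case REC_REDO:
      pending++;                /* group is still open */
      break;
    case REC_UNDO:
    case REC_CLR_END:
      applied+= pending;        /* group is complete: apply its REDOs */
      pending= 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending= 0;               /* boundary: forget the incomplete group */
      break;
    }
  }
  return applied;
}
```

In this model the sequence REDO, UNDO, REDO*, REDO, CLR_END applies three REDOs (REDO* is wrongly picked up by the CLR_END), while the same sequence with a boundary record right after REDO* applies only two.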
install_redo_exec_hook(INCOMPLETE_GROUP);
install_redo_exec_hook(REDO_INSERT_ROW_HEAD);
install_redo_exec_hook(REDO_INSERT_ROW_TAIL);
Merge some changes from sql directory in 5.1 tree Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added redo_free_head_or_tail() & redo_insert_row_blobs() Added uuid to control file maria_checks now verifies that not used part of bitmap is 0 REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL Fixes problem when trying to read block outside of file during REDO include/my_global.h: STACK_DIRECTION is already set by configure mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Test shrinking of VARCHAR mysys/my_realloc.c: Fixed indentation mysys/safemalloc.c: Fixed indentation sql/filesort.cc: Removed some casts sql/mysqld.cc: Added missing setting of myisam_stats_method_str sql/uniques.cc: Removed some casts storage/maria/ma_bitmap.c: Added printing of bitmap (for debugging) Renamed _ma_print_bitmap() -> _ma_print_bitmap_changes() Added _ma_set_full_page_bits() Fixed bug in ma_bitmap_find_new_place() (affecting updates) when using big files storage/maria/ma_blockrec.c: Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added code to fix some cases where redo when using blobs didn't produce identical .MAD files as normal usage REDO_FREE_ROW_BLOCKS no longer changes pages; We only mark things free in bitmap Remove TAIL and filler extents from REDO_FREE_BLOCKS log entry. (Fixed some asserts) REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Delete tails in update. (Fixed bug when doing update that shrinks blob/varchar length) Fixed bug when doing insert in block outside of file size. Added redo_free_head_or_tail() & redo_insert_row_blobs() Added pagecache_unlock_by_link() when read fails. 
Much more comments, DBUG and ASSERT entries storage/maria/ma_blockrec.h: Prototypes of new functions Define of SUB_RANGE_SIZE & BLOCK_FILLER_SIZE storage/maria/ma_check.c: Verify that not used part of bitmap is 0 storage/maria/ma_control_file.c: Added uuid to control file storage/maria/ma_loghandler.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_loghandler.h: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_pagecache.c: If we write full block, remove error flag for block. (Fixes problem when trying to read block outside of file) storage/maria/ma_recovery.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_test1.c: Allow option after 'b' to be compatible with ma_test2 (This is just to simplify test scripts like ma_test_recovery) storage/maria/ma_test2.c: Default size of blob is now 1000 instead of 1 storage/maria/ma_test_all.sh: Added test for bigger blobs storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Added test for bigger blobs
2007-10-19 23:24:22 +02:00
install_redo_exec_hook(REDO_INSERT_ROW_BLOBS);
install_redo_exec_hook(REDO_PURGE_ROW_HEAD);
install_redo_exec_hook(REDO_PURGE_ROW_TAIL);
install_redo_exec_hook(REDO_FREE_HEAD_OR_TAIL);
install_redo_exec_hook(REDO_FREE_BLOCKS);
install_redo_exec_hook(REDO_DELETE_ALL);
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
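The commit message above mentions shifting the record number stored in keys up by one bit, to leave room for a flag saying whether a transaction id follows. A minimal sketch of that kind of encoding is shown here; the function names and the use of the lowest bit are assumptions made for illustration and are not taken from the Maria sources.

```c
#include <stdint.h>

/*
  Hypothetical helpers: pack a record position and a "transid follows"
  flag into one number, using the lowest bit for the flag.
*/
uint64_t pack_key_recpos(uint64_t recpos, int transid_follows)
{
  return (recpos << 1) | (transid_follows ? 1u : 0u);
}

/* Recover the record position by dropping the flag bit. */
uint64_t unpack_key_recpos(uint64_t stored)
{
  return stored >> 1;
}

/* Return 1 if the stored value says a transaction id follows. */
int key_has_transid(uint64_t stored)
{
  return (int) (stored & 1u);
}
```

The cost of this layout is one bit of range for record positions, which is why it only makes sense where the flag is actually needed.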
install_redo_exec_hook(REDO_INDEX);
install_redo_exec_hook(REDO_INDEX_NEW_PAGE);
install_redo_exec_hook(REDO_INDEX_FREE_PAGE);
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
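The skipping rules listed for ma_recovery.c in the commit message above (skip an unusable table instead of failing, warn only when an UNDO is skipped in the UNDO phase, skip records older than skip_redo_lsn in the REDO phase) can be summarized as a small decision function. This is a sketch only: `struct table_info`, `decide_record()` and the enums are hypothetical names for illustration, and the real code tracks far more state.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;         /* hypothetical stand-in for the LSN type */

enum phase  { PHASE_REDO, PHASE_UNDO };
enum action { ACT_APPLY, ACT_SKIP };

/* Hypothetical per-table information known to recovery. */
struct table_info
{
  int   usable;                 /* 0 if corrupted or repaired by maria_chk */
  lsn_t skip_redo_lsn;          /* records older than this are obsolete */
};

/*
  Decide what to do with one record. tbl is NULL when the table is
  missing. *warnings is incremented when an UNDO has to be skipped in
  the UNDO phase, since the rollback is then incomplete.
*/
enum action decide_record(const struct table_info *tbl, lsn_t rec_lsn,
                          enum phase phase, unsigned *warnings)
{
  if (tbl == NULL || !tbl->usable)
  {
    if (phase == PHASE_UNDO)
      (*warnings)++;
    return ACT_SKIP;            /* do not fail, go to the next record */
  }
  if (phase == PHASE_REDO && rec_lsn < tbl->skip_redo_lsn)
    return ACT_SKIP;            /* change is already reflected in the table */
  return ACT_APPLY;
}
```

Returning a warning count rather than an error matches the behaviour described earlier in this history, where maria_apply_log() reports a count of warnings.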
install_redo_exec_hook(REDO_BITMAP_NEW_PAGE);
install_redo_exec_hook(UNDO_ROW_INSERT);
install_redo_exec_hook(UNDO_ROW_DELETE);
install_redo_exec_hook(UNDO_ROW_UPDATE);
install_redo_exec_hook(UNDO_KEY_INSERT);
install_redo_exec_hook(UNDO_KEY_DELETE);
install_redo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT);
install_redo_exec_hook(COMMIT);
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug; see the comment for ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output at the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and reports any difference. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
install_redo_exec_hook(CLR_END);
install_undo_exec_hook(UNDO_ROW_INSERT);
install_undo_exec_hook(UNDO_ROW_DELETE);
install_undo_exec_hook(UNDO_ROW_UPDATE);
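The commit note above says that when the REDO phase meets a CLR_END it must update trn->undo_lsn and sometimes state.records, depending on the type of record that was undone. A minimal sketch of that bookkeeping follows. Every sketch_ name and struct here is hypothetical and is not the real Maria type; the records++ for an undone delete is an assumption that mirrors the records-- the note describes for an undone insert.

```c
#include <assert.h>

typedef unsigned long long sketch_lsn;

/* Hypothetical: the kind of record a CLR_END says was undone. */
enum sketch_undone_type
{
  SKETCH_UNDONE_INSERT,
  SKETCH_UNDONE_DELETE,
  SKETCH_UNDONE_UPDATE
};

/* Hypothetical, simplified stand-ins for the CLR_END payload, TRN and state. */
struct sketch_clr_end
{
  sketch_lsn previous_undo_lsn;       /* LSN of the UNDO before the undone one */
  enum sketch_undone_type undone;
};
struct sketch_trn   { sketch_lsn undo_lsn; };
struct sketch_state { long records; };

/* What the REDO phase does, in this sketch, when it reads a CLR_END. */
static void sketch_redo_clr_end(const struct sketch_clr_end *clr,
                                struct sketch_trn *trn,
                                struct sketch_state *state)
{
  assert(clr && trn && state);
  /* The transaction's undo chain now resumes at the previous UNDO. */
  trn->undo_lsn= clr->previous_undo_lsn;
  switch (clr->undone)
  {
  case SKETCH_UNDONE_INSERT: state->records--; break; /* row removed again   */
  case SKETCH_UNDONE_DELETE: state->records++; break; /* row re-inserted     */
  case SKETCH_UNDONE_UPDATE: break;                   /* row count unchanged */
  }
}
```

Storing the undone type in the record is what lets the REDO phase pick the right branch without re-reading the original UNDO.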
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need it now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address of the MARIA_PINNED_PAGE used - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key Use MARIA_RECORD_POS for record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_write_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
install_undo_exec_hook(UNDO_KEY_INSERT);
install_undo_exec_hook(UNDO_KEY_DELETE);
install_undo_exec_hook(UNDO_KEY_DELETE_WITH_ROOT);
/* REDO_NEW_ROW_HEAD shares entry with REDO_INSERT_ROW_HEAD */
install_redo_exec_hook_shared(REDO_NEW_ROW_HEAD, REDO_INSERT_ROW_HEAD);
/* REDO_NEW_ROW_TAIL shares entry with REDO_INSERT_ROW_TAIL */
install_redo_exec_hook_shared(REDO_NEW_ROW_TAIL, REDO_INSERT_ROW_TAIL);
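The install calls above register one execution hook per log record type, and the _shared variant lets two record types reuse one hook. The definitions of those macros are not shown in this section, so the following is only a sketch of how such a table could work: the sketch_ and SK_ names, the int payload, and the return values are all hypothetical and are not the real Maria definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record types; the real list is declared elsewhere. */
enum sketch_rec_type
{
  SK_REDO_INSERT_ROW_HEAD,
  SK_REDO_NEW_ROW_HEAD,
  SK_REDO_INSERT_ROW_TAIL,
  SK_REDO_NEW_ROW_TAIL,
  SK_REC_TYPE_COUNT
};

typedef int (*sketch_redo_hook)(int payload);

/* One slot per record type; NULL means "no hook installed". */
static sketch_redo_hook sketch_redo_hooks[SK_REC_TYPE_COUNT];

static int sketch_exec_REDO_INSERT_ROW_HEAD(int payload) { return payload + 1; }
static int sketch_exec_REDO_INSERT_ROW_TAIL(int payload) { return payload + 2; }

/* Install the hook named after the record type itself. */
#define sketch_install_redo_hook(R) \
  (sketch_redo_hooks[SK_ ## R]= sketch_exec_ ## R)
/* Install for record type R the hook that belongs to record type R2. */
#define sketch_install_redo_hook_shared(R, R2) \
  (sketch_redo_hooks[SK_ ## R]= sketch_exec_ ## R2)

static void sketch_install_hooks(void)
{
  sketch_install_redo_hook(REDO_INSERT_ROW_HEAD);
  sketch_install_redo_hook(REDO_INSERT_ROW_TAIL);
  sketch_install_redo_hook_shared(REDO_NEW_ROW_HEAD, REDO_INSERT_ROW_HEAD);
  sketch_install_redo_hook_shared(REDO_NEW_ROW_TAIL, REDO_INSERT_ROW_TAIL);
}

/* Run the hook for a record type, or return -1 if none is installed. */
static int sketch_dispatch(enum sketch_rec_type type, int payload)
{
  sketch_redo_hook hook= sketch_redo_hooks[type];
  return hook ? hook(payload) : -1;
}
```

Sharing an entry keeps a single code path for record types that are applied the same way, at the cost of the hook not knowing from its address alone which type triggered it.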
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle) - fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc) - when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix) - UNDO_BULK_INSERT_WITH_REPAIR is also used without repair, so it changes name mysql-test/r/maria.result: tests for bugs fixed mysql-test/t/maria.test: tests for bugs fixed sql/sql_insert.cc: Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging. storage/maria/ha_maria.cc: Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()). storage/maria/ha_maria.h: variable now can have 3 states not 2 storage/maria/ma_bitmap.c: name change storage/maria/ma_blockrec.c: name change storage/maria/ma_blockrec.h: name change storage/maria/ma_check.c: * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before. * the log record of bulk insert can be used even without repair * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush). storage/maria/ma_key_recover.c: name change storage/maria/ma_loghandler.c: name change storage/maria/ma_loghandler.h: name change storage/maria/ma_pagecache.c: A function, to check in debug builds that no dirty pages exist for a file. 
storage/maria/ma_pagecache.h: new function (nothing in non-debug) storage/maria/ma_recovery.c: _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist). storage/maria/maria_chk.c: comment storage/maria/maria_def.h: new names mysql-test/suite/rpl/r/rpl_stm_maria.result: result (it used to crash) mysql-test/suite/rpl/t/rpl_stm_maria.test: Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
install_redo_exec_hook(UNDO_BULK_INSERT);
install_undo_exec_hook(UNDO_BULK_INSERT);
Fix for BUG#37876 "Importing Maria table from other server via binary copy does not work": - a table, after auto-zerofill (ha_maria::check_and_repair()), kept its state's LSNs unchanged, which could be the same as the create_rename_lsn of another pre-existing table, which would break versioning as this LSN serves as unique identifier in the versioning code (in maria_open()). Even the state pieces which maria_zerofill() did change were lost (because they didn't go to disk). - after this fix, if two tables were auto-zerofilled at the same time (by _ma_mark_changed()) they could receive the same create_rename_lsn, which would break versioning again. Fix is to write a log record each time a table is imported. - Print state's LSNs (create_rename_lsn, is_of_horizon, skip_redo_lsn) and UUID in maria_chk -dvv. mysql-test/r/maria-autozerofill.result: result mysql-test/t/maria-autozerofill.test: Test for auto-zerofilling storage/maria/ha_maria.cc: The state changes done by auto-zerofilling never reached disk. storage/maria/ma_check.c: When zerofilling a table, including its pages' LSNs, new state LSNs are needed next time the table is imported into a Maria instance. storage/maria/ma_create.c: Write LOGREC_IMPORTED_TABLE when importing a table. This is informative and ensures that the table gets a unique create_rename_lsn even though multiple tables are imported by concurrent threads (it advances the log's end LSN). storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: instead of using translog_get_horizon() for state's LSNs of imported table, use the LSN of to-be-written LOGREC_IMPORTED_TABLE. storage/maria/ma_loghandler.c: New type of log record storage/maria/ma_loghandler.h: New type of log record storage/maria/ma_loghandler_lsn.h: New name for constant as can be used not only by maria_chk but auto-zerofill now too. storage/maria/ma_open.c: instead of using translog_get_horizon() for state's LSNs of imported table, use the LSN of to-be-written LOGREC_IMPORTED_TABLE. 
storage/maria/ma_recovery.c: print content of LOGREC_IMPORTED_TABLE in maria_read_log. storage/maria/maria_chk.c: print info about LSNs of the table's state, and UUID, when maria_chk -dvv storage/maria/maria_pack.c: new name for constant storage/maria/unittest/ma_test_recovery.pl: Now that maria_chk -dvv shows state LSNs and UUID those need to be filtered out, as maria_read_log -a does not use the same as at original run.
2008-07-09 11:02:27 +02:00
install_redo_exec_hook(IMPORTED_TABLE);
install_redo_exec_hook(DEBUG_INFO);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
current_group_end_lsn= LSN_IMPOSSIBLE;
* WL#4137 Maria - Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: test sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we no longer put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we no longer put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDOs about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
#ifndef DBUG_OFF
current_group_table= NULL;
#endif
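The commit note above explains why a page must not be stamped with the UNDO's LSN as soon as one REDO of a group is applied: a later REDO of the same group for the same page would then look already applied and be skipped. A minimal sketch of the deferred stamping follows. The sketch_ names are hypothetical, and the skip rule (skip when the page LSN is not lower than the record's LSN) is an assumption made for the illustration, not a quote of the real page check.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long sketch_lsn;

/* Hypothetical, simplified page: an LSN, some content, and a pin flag. */
struct sketch_page
{
  sketch_lsn lsn;
  int value;
  int pinned;
};

/*
  Apply one REDO of the current group. Returns 1 if applied, 0 if skipped.
  The page stays pinned and its LSN is deliberately left unchanged.
*/
static int sketch_apply_redo(struct sketch_page *page, sketch_lsn redo_lsn,
                             int delta)
{
  if (page->lsn >= redo_lsn)
    return 0;                      /* page already contains this change */
  page->value+= delta;
  page->pinned= 1;
  return 1;
}

/* At the end of the group: unpin touched pages and stamp the UNDO's LSN. */
static void sketch_end_group(struct sketch_page *pages, size_t count,
                             sketch_lsn undo_lsn)
{
  size_t i;
  for (i= 0; i < count; i++)
  {
    if (pages[i].pinned)
    {
      pages[i].lsn= undo_lsn;
      pages[i].pinned= 0;
    }
  }
}
```

With two REDOs at LSN 10 and 11 and their UNDO at LSN 12, stamping 12 right after the first REDO would make the second one fail the check; stamping only at the end of the group applies both and still makes a later replay of the same group a no-op.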
if (unlikely(lsn == LSN_IMPOSSIBLE || lsn == translog_get_horizon()))
{
  tprint(tracef, "checkpoint address refers to the end of the log or "
         "the log is empty, nothing to do.\n");
  DBUG_RETURN(0);
}
Remove SAFE_MODE for opt_range as it disables UPDATE to use keys REDO optimization (Basically avoid moving blocks from/to pagecache) More command line arguments to maria_read_log Fixed recovery bug when recreating table sql/opt_range.cc: Remove SAFE_MODE for opt_range as it disables UPDATE to use keys storage/maria/ma_blockrec.c: REDO optimization Use new interface for pagecache_reads to avoid copying page buffers storage/maria/ma_loghandler.c: Patch from Sanja: - Added new parameter to translog_get_page to use direct links to pagecache - Changed scanner to be able to use direct links This avoids a lot of calls to bmove512() in page cache. storage/maria/ma_loghandler.h: Added direct link to pagecache objects storage/maria/ma_open.c: Added const to parameter Added missing braces storage/maria/ma_pagecache.c: From Sanja: - Added direct links to pagecache (from pagecache_read()) Direct link means that on pagecache_read we get back a pointer to the pagecache buffer From Monty: - Fixed arguments to init_page_cache to handle big page caches - Fixed compiler warnings - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors storage/maria/ma_pagecache.h: Changed block numbers from int to long to be able to handle big page caches Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK storage/maria/ma_recovery.c: Fixed recovery bug when recreating table (table was kept open) Moved some variables to function start (portability) Added space to some print messages storage/maria/maria_chk.c: key_buffer_size -> page_buffer_size storage/maria/maria_def.h: Changed default page_buffer_size to 10M storage/maria/maria_read_log.c: Added more startup options: --version --undo (apply undo) --page_cache_size (to run with big cache sizes) --silent (to not get any output from --apply) storage/maria/unittest/ma_control_file-t.c: Fixed compiler warning storage/maria/unittest/ma_test_loghandler-t.c: Added new argument to translog_init_scanner() 
storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Added new argument to translog_init_scanner() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
len= translog_read_record_header(lsn, &rec);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash occurs between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude it is corrupted, and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page did not in fact exist yet, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO creates a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
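The bitmap-page fix described in the commit note above (extend the file to 8190 bytes first, then write the 2-byte end marker) can be illustrated with a small sketch. This is not Maria's code: `create_first_bitmap_page()`, `bitmap_page_is_complete()`, `BITMAP_PAGE_SIZE` and the marker bytes are hypothetical names and values chosen for illustration, and plain POSIX calls stand in for Maria's own file layer.

```c
/* Illustrative sketch only; all names and the marker value are assumptions. */
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define BITMAP_PAGE_SIZE 8192
/* Placeholder end marker; the real marker bytes are not taken from Maria. */
static const unsigned char bitmap_marker[2]= { 'b', 'm' };

/* Returns 0 on success, -1 on error. */
static int create_first_bitmap_page(int fd)
{
  const off_t body_len= BITMAP_PAGE_SIZE - (off_t) sizeof(bitmap_marker);
  assert(fd >= 0);
  /*
    Step 1: extend the file to just short of a full page. A crash here
    leaves a page that is visibly too short, so a reader can recreate it.
  */
  if (ftruncate(fd, body_len) != 0)
    return -1;
  /* Step 2: append the end marker, which completes the page. */
  if (pwrite(fd, bitmap_marker, sizeof(bitmap_marker), body_len) !=
      (ssize_t) sizeof(bitmap_marker))
    return -1;
  return 0;
}

/* Returns 1 if a full page with a valid end marker exists, else 0. */
static int bitmap_page_is_complete(int fd)
{
  struct stat st;
  unsigned char tail[sizeof(bitmap_marker)];
  if (fstat(fd, &st) != 0 || st.st_size < BITMAP_PAGE_SIZE)
    return 0;                       /* too short: treat as not created */
  if (pread(fd, tail, sizeof(tail), BITMAP_PAGE_SIZE - sizeof(tail)) !=
      (ssize_t) sizeof(tail))
    return 0;
  return memcmp(tail, bitmap_marker, sizeof(tail)) == 0;
}
```

The point of the ordering is that a crash can never leave a full-length page without its marker: the file is either too short to be read as a page, or complete.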
if (len == RECHEADER_READ_ERROR)
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocating tail pages - Ensure we reserve space for directory entry when calculating space for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clears freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible.
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these were written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdb extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it.
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution (This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve space for directory entry when calculating space for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible.
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deletion of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down.
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables) storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automatically from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to read header of the first record.");
DBUG_RETURN(1);
}
if (translog_scanner_init(lsn, 1, &scanner, 1))
{
tprint(tracef, "Scanner init failed\n");
DBUG_RETURN(1);
}
for (i= 1;;i++)
{
uint16 sid= rec.short_trid;
const LOG_DESC *log_desc= &log_record_type_descriptor[rec.type];
display_record_position(log_desc, &rec, i);
/*
A complete group is a set of log records with an "end mark" record
(e.g. a set of REDOs for an operation, terminated by an UNDO for this
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
operation); if there is no "end mark" record the group is incomplete and
won't be executed.
*/
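The group rule stated in the comment above (records of a group are executed only once its "end mark" record is seen, tracked per transaction) can be sketched as a toy model. Everything here is hypothetical: `toy_record`, the `TOY_*` constants and `count_executable()` are simplified stand-ins, not the real `LOG_DESC` machinery, and discarding an open group when a stand-alone record arrives is an assumption about the branch that follows.

```c
/* Toy model of complete vs. incomplete log record groups (assumed names). */
#include <assert.h>
#include <stddef.h>

enum toy_in_group
{
  TOY_NOT_LAST_IN_GROUP,   /* e.g. a REDO inside an operation */
  TOY_LAST_IN_GROUP,       /* the "end mark", e.g. the UNDO of the operation */
  TOY_IS_GROUP_ITSELF      /* a record that forms a group on its own */
};

struct toy_record
{
  unsigned short short_trid;   /* transaction that wrote the record */
  enum toy_in_group in_group;
};

#define TOY_MAX_TRANS 8

/*
  Returns how many records belong to complete groups and so would be
  executed. Records of groups that never get an end mark are not counted.
*/
static size_t count_executable(const struct toy_record *recs, size_t n)
{
  size_t pending[TOY_MAX_TRANS]= {0};  /* open-group records per transaction */
  size_t executable= 0, i;
  for (i= 0; i < n; i++)
  {
    unsigned short sid= recs[i].short_trid;
    assert(sid < TOY_MAX_TRANS);
    switch (recs[i].in_group)
    {
    case TOY_NOT_LAST_IN_GROUP:
      pending[sid]++;                  /* group stays open */
      break;
    case TOY_LAST_IN_GROUP:
      executable+= pending[sid] + 1;   /* end mark closes the group */
      pending[sid]= 0;
      break;
    case TOY_IS_GROUP_ITSELF:
      pending[sid]= 0;                 /* assumed: open group is discarded */
      executable++;
      break;
    }
  }
  return executable;
}
```

Keeping one counter per transaction matters because records of different transactions are interleaved in the log, so an end mark only closes the group of the transaction that wrote it.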
if ((log_desc->record_in_group == LOGREC_IS_GROUP_ITSELF) ||
(log_desc->record_in_group == LOGREC_LAST_IN_GROUP))
{
if (all_active_trans[sid].group_start_lsn != LSN_IMPOSSIBLE)
{
if (log_desc->record_in_group == LOGREC_IS_GROUP_ITSELF)
{
/*
This can happen if the transaction got a table write error and then
unlocked its tables, which wrote a COMMIT record. It can also be an
INCOMPLETE_GROUP record written by a previous recovery.
*/
tprint(tracef, "\nDiscarding incomplete group before this record\n");
all_active_trans[sid].group_start_lsn= LSN_IMPOSSIBLE;
}
else
{
struct st_translog_scanner_data scanner2;
TRANSLOG_HEADER_BUFFER rec2;
/*
There is a complete group for this transaction, containing more
than this event.
*/
tprint(tracef, " ends a group:\n");
len=
translog_read_record_header(all_active_trans[sid].group_start_lsn,
&rec2);
if (len < 0) /* EOF or error */
{
tprint(tracef, "Cannot find record where it should be\n");
goto err;
}
if (lsn_end != LSN_IMPOSSIBLE && rec2.lsn >= lsn_end)
{
tprint(tracef,
"lsn_end reached at (%lu,0x%lx). "
"Skipping rest of redo entries",
LSN_IN_PARTS(rec2.lsn));
translog_destroy_scanner(&scanner);
translog_free_record_header(&rec);
DBUG_RETURN(0);
}
if (translog_scanner_init(rec2.lsn, 1, &scanner2, 1))
          {
            tprint(tracef, "Scanner2 init failed\n");
            goto err;
          }
          current_group_end_lsn= rec.lsn;
          do
          {
            if (rec2.short_trid == sid) /* it's in our group */
            {
              const LOG_DESC *log_desc2= &log_record_type_descriptor[rec2.type];
              display_record_position(log_desc2, &rec2, 0);
              if (apply == MARIA_LOG_CHECK)
              {
                translog_size_t read_len;
                enlarge_buffer(&rec2);
                read_len=
                  translog_read_record(rec2.lsn, 0, rec2.record_length,
                                       log_record_buffer.str, NULL);
                if (read_len != rec2.record_length)
                {
                  tprint(tracef, "Cannot read record's body: read %u of"
                         " %u bytes\n", read_len, rec2.record_length);
                  translog_destroy_scanner(&scanner2);
                  translog_free_record_header(&rec2);
                  goto err;
                }
              }
              if (apply == MARIA_LOG_APPLY &&
                  display_and_apply_record(log_desc2, &rec2))
              {
2007-10-01 08:59:05 +02:00
                translog_destroy_scanner(&scanner2);
                translog_free_record_header(&rec2);
                goto err;
              }
            }
            translog_free_record_header(&rec2);
            len= translog_read_next_record_header(&scanner2, &rec2);
            if (len < 0) /* EOF or error */
            {
              tprint(tracef, "Cannot find record where it should be\n");
              translog_destroy_scanner(&scanner2);
              translog_free_record_header(&rec2);
              goto err;
            }
          }
          while (rec2.lsn < rec.lsn);
          /* group finished */
          all_active_trans[sid].group_start_lsn= LSN_IMPOSSIBLE;
          current_group_end_lsn= LSN_IMPOSSIBLE; /* for debugging */
          display_record_position(log_desc, &rec, 0);
          translog_destroy_scanner(&scanner2);
          translog_free_record_header(&rec2);
        }
      }
      if (apply == MARIA_LOG_APPLY &&
          display_and_apply_record(log_desc, &rec))
        goto err;
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: test sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we no longer put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we no longer put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
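The commit note above explains why a page must not be stamped with the UNDO's LSN until every REDO of the group has been applied. The following is a minimal, self-contained sketch of that reasoning, not the engine's code: `toy_page`, `toy_apply_group` and the `stamp_early` flag are hypothetical names, and the skip test (`page LSN >= REDO LSN` means "already applied") is an assumption made for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for a cached data page; not a Maria structure. */
struct toy_page { unsigned long lsn; int value; };

/*
  Applies a group of n REDOs to one page and returns how many were applied.
  Assumption: a REDO is skipped when the page's LSN is already >= its LSN.
  stamp_early != 0 models the old behaviour (stamp the page with the UNDO's
  LSN right after each REDO); stamp_early == 0 models the fix (stamp once,
  when the whole group is done).
*/
int toy_apply_group(struct toy_page *page, const unsigned long *redo_lsn,
                    const int *delta, int n, unsigned long undo_lsn,
                    int stamp_early)
{
  int applied= 0;
  assert(page != 0 && n >= 0);
  for (int i= 0; i < n; i++)
  {
    if (page->lsn >= redo_lsn[i])
      continue;                       /* page looks newer: REDO is skipped */
    page->value+= delta[i];           /* the "change" carried by the REDO */
    applied++;
    if (stamp_early)
      page->lsn= undo_lsn;            /* old behaviour: hides later REDOs */
  }
  if (applied && page->lsn < undo_lsn)
    page->lsn= undo_lsn;              /* end of group: stamp the page once */
  return applied;
}
```

With two REDOs at LSN 10 and 11 and an UNDO at LSN 12, early stamping applies only the first REDO, while deferred stamping applies both and still leaves the page at LSN 12.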
#ifndef DBUG_OFF
      current_group_table= NULL;
#endif
    }
    else /* record does not end group */
    {
      /* just record the fact, can't know if can execute yet */
      if (all_active_trans[sid].group_start_lsn == LSN_IMPOSSIBLE)
      {
        /* group not yet started */
        all_active_trans[sid].group_start_lsn= rec.lsn;
      }
    }
    translog_free_record_header(&rec);
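The group handling in this loop can be summarized with a small, self-contained model. It is only a sketch under stated assumptions: `toy_rec`, `toy_redo_scan`, `LSN_NONE` and the in-memory array standing in for the log are all hypothetical, and "applying" a record is reduced to counting it. What it keeps from the code above is the rule that a record which does not end a group only marks the group's start, and that the group is replayed from that start once its end record is seen.

```c
#include <assert.h>

#define LSN_NONE 0UL   /* hypothetical stand-in for LSN_IMPOSSIBLE */
#define MAX_TRID 4     /* hypothetical bound on short transaction ids */

/* Hypothetical, simplified log record; not TRANSLOG_HEADER_BUFFER. */
struct toy_rec { unsigned long lsn; unsigned short trid; int ends_group; };

/* Per-transaction start of the pending group; zero-initialized = LSN_NONE. */
unsigned long toy_group_start[MAX_TRID];

/* Scans the log once and returns the number of records "applied". */
int toy_redo_scan(const struct toy_rec *log, int n)
{
  int applied= 0;
  for (int i= 0; i < n; i++)
  {
    const struct toy_rec *rec= &log[i];
    assert(rec->trid < MAX_TRID);
    if (!rec->ends_group)
    {
      /* Cannot know yet if the group will complete: remember its start. */
      if (toy_group_start[rec->trid] == LSN_NONE)
        toy_group_start[rec->trid]= rec->lsn;
      continue;
    }
    if (toy_group_start[rec->trid] != LSN_NONE)
    {
      /* Second pass: replay the earlier records of this transaction. */
      for (int j= 0; j < i; j++)
      {
        if (log[j].lsn >= toy_group_start[rec->trid] &&
            log[j].trid == rec->trid)
          applied++;
      }
      toy_group_start[rec->trid]= LSN_NONE;   /* group finished */
    }
    applied++;                                /* the end record itself */
  }
  return applied;
}
```

In this model a transaction whose group never receives an end record contributes nothing, which mirrors the "can't know if can execute yet" comment: its records are remembered but not replayed.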
len= translog_read_next_record_header(&scanner, &rec);
if (len < 0)
{
switch (len)
{
case RECHEADER_READ_EOF:
tprint(tracef, "EOF on the log\n");
break;
case RECHEADER_READ_ERROR:
tprint(tracef, "Error reading log\n");
goto err;
}
break;
}
}
translog_destroy_scanner(&scanner);
translog_free_record_header(&rec);
if (recovery_message_printed == REC_MSG_REDO)
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
{
fprintf(stderr, " 100%%");
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
fflush(stderr);
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changed logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead of -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
procent_printed= 1;
}
DBUG_RETURN(0);
err:
translog_destroy_scanner(&scanner);
translog_free_record_header(&rec);
DBUG_RETURN(1);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
}
/**
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
@brief Informs about any aborted groups or uncommitted transactions,
prepares for the UNDO phase if needed.
@note Observe that it may init trnman.
*/
static uint end_of_redo_phase(my_bool prepare_for_undo_phase)
{
uint sid, uncommitted= 0;
char llbuf[22];
Remove SAFE_MODE for opt_range as it disables UPDATE to use keys REDO optimization (Basically avoid moving blocks from/to pagecache) More command line arguments to maria_read_log Fixed recovery bug when recreating table sql/opt_range.cc: Remove SAFE_MODE for opt_range as it disables UPDATE to use keys storage/maria/ma_blockrec.c: REDO optimization Use new interface for pagecache_reads to avoid copying page buffers storage/maria/ma_loghandler.c: Patch from Sanja: - Added new parameter to translog_get_page to use direct links to pagecache - Changed scanner to be able to use direct links This avoids a lot of calls to bmove512() in page cache. storage/maria/ma_loghandler.h: Added direct link to pagecache objects storage/maria/ma_open.c: Added const to parameter Added missing braces storage/maria/ma_pagecache.c: From Sanja: - Added direct links to pagecache (from pagecache_read()) Direct link means that on pagecache_read we get back a pointer to the pagecache buffer From Monty: - Fixed arguments to init_page_cache to handle big page caches - Fixed compiler warnings - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors storage/maria/ma_pagecache.h: Changed block numbers from int to long to be able to handle big page caches Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK storage/maria/ma_recovery.c: Fixed recovery bug when recreating table (table was kept open) Moved some variables to function start (portability) Added space to some print messages storage/maria/maria_chk.c: key_buffer_size -> page_buffer_size storage/maria/maria_def.h: Changed default page_buffer_size to 10M storage/maria/maria_read_log.c: Added more startup options: --version --undo (apply undo) --page_cache_size (to run with big cache sizes) --silent (to not get any output from --apply) storage/maria/unittest/ma_control_file-t.c: Fixed compiler warning storage/maria/unittest/ma_test_loghandler-t.c: Added new argument to translog_init_scanner()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Added new argument to translog_init_scanner() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
LSN addr;
hash_free(&all_dirty_pages);
/*
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expects the new row's checksum in cur_row.checksum (was already true) and the old row's checksum in new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of the record's checksum. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling (*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is therefore unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is therefore unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed as n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum appears to work (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so it checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, the live checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
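The checksum-delta scheme in the commit message above can be sketched as follows. This is a minimal illustration, not the Maria code: `table_state`, `checksum_delta()`, `store_delta()`, `read_delta()` and `redo_apply_delta()` are hypothetical names, the low-byte-first layout of the 4 stored bytes is an assumption, and the real hooks run under the log mutex, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;              /* hypothetical stand-in for LSN */

struct table_state {                 /* hypothetical, not MARIA_SHARE's state */
  uint32_t checksum;                 /* table's live checksum */
  lsn_t    is_of_lsn;                /* LSN up to which the state is current */
};

/* Variation caused by one operation, modulo 2^32 (e.g. 456 - 123 = 333). */
static uint32_t checksum_delta(uint32_t before, uint32_t after)
{
  return after - before;
}

/* Store the delta in 4 bytes, low byte first (assumed layout). */
static void store_delta(unsigned char *buf, uint32_t delta)
{
  assert(buf != NULL);
  buf[0]= (unsigned char) (delta & 0xFF);
  buf[1]= (unsigned char) ((delta >> 8) & 0xFF);
  buf[2]= (unsigned char) ((delta >> 16) & 0xFF);
  buf[3]= (unsigned char) ((delta >> 24) & 0xFF);
}

static uint32_t read_delta(const unsigned char *buf)
{
  assert(buf != NULL);
  return (uint32_t) buf[0] | ((uint32_t) buf[1] << 8) |
         ((uint32_t) buf[2] << 16) | ((uint32_t) buf[3] << 24);
}

/*
  REDO phase: apply the delta only if the table's state is older than the
  log record. Returns 1 if the checksum was updated, 0 if it was skipped.
*/
static int redo_apply_delta(struct table_state *st, lsn_t rec_lsn,
                            uint32_t delta)
{
  if (st->is_of_lsn >= rec_lsn)
    return 0;                        /* state already reflects this record */
  st->checksum+= delta;              /* unsigned arithmetic wraps modulo 2^32 */
  return 1;
}
```

Storing the variation rather than the absolute checksum is what lets the same 4 bytes serve both phases: the REDO phase adds the delta to an older state, and the UNDO phase can reuse it instead of recomputing the row's checksum.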
hash_free() can be called multiple times probably, but be safe if that
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks the marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
changes
*/
bzero(&all_dirty_pages, sizeof(all_dirty_pages));
my_free(dirty_pages_pool, MYF(MY_ALLOW_ZERO_PTR));
dirty_pages_pool= NULL;
llstr(max_long_trid, llbuf);
tprint(tracef, "Maximum transaction long id seen: %s\n", llbuf);
Store maximum transaction id into control file at clean shutdown. This can serve to maria_chk to check that trids found in rows and keys are not too big. Also used by Recovery when logs are lost. Options --require-control-file, --datadir, --log-dir (yes, the dashes are inconsistent but I imitated mysqld --datadir and --maria-log-dir) for maria_chk. Lock control file _before_ reading its content. storage/maria/ha_maria.cc: new prototype storage/maria/ma_check.c: A function to find the max trid in the system (consults transaction manager and control file), to check tables. storage/maria/ma_checkpoint.c: new prototype storage/maria/ma_control_file.c: Store max trid into control file, in a backward-compatible way (can still read old control files). Parameter to ma_control_file_open(), to not create the log if it's missing (maria_chk needs that). Lock control file _before_ reading its content. Fix for a segfault when reading an old control file (bzero() with a negative second argument) storage/maria/ma_control_file.h: changes to the control file module's API storage/maria/ma_init.c: When Maria shuts down cleanly, store max trid into control file. storage/maria/ma_loghandler.c: new prototype storage/maria/ma_recovery.c: During recovery, consult max trid stored in control file, in case it is bigger than what we found in log (case of logs manually removed by user). storage/maria/ma_test1.c: new prototype storage/maria/ma_test2.c: new prototype storage/maria/maria_chk.c: New option --require-control-file (abort if control file not found), --datadir (path for control file (and for logs if --log-dir not specified)), --log-dir (path for logs). Try to open control file when maria_chk starts. storage/maria/maria_read_log.c: new prototype storage/maria/trnman.c: A new function to know max trid in transaction manager storage/maria/trnman_public.h: New function storage/maria/unittest/ma_control_file-t.c: new prototypes. 
Testing storing and retrieving the max trid to/from control file storage/maria/unittest/ma_test_loghandler-t.c: new prototype storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype storage/maria/unittest/ma_test_loghandler_nologs-t.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype
2008-04-04 19:10:53 +02:00
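The max-trid fallback described in the commit message above can be sketched like this. It is only an illustration under stated assumptions: the sizes `CF_OLD_SIZE` and `CF_TRID_SIZE`, the field position, the byte order and both function names are hypothetical and do not describe the real control file layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CF_OLD_SIZE  8u   /* hypothetical size of an old-format image */
#define CF_TRID_SIZE 6u   /* hypothetical width of the stored max trid */

/*
  Read the max trid from a control-file image. An old image is shorter and
  has no such field; in that case report 0 instead of reading (or zeroing)
  past the end of the buffer.
*/
static uint64_t cf_read_max_trid(const unsigned char *image, size_t len)
{
  uint64_t trid= 0;
  size_t i;
  assert(image != NULL);
  if (len < CF_OLD_SIZE + CF_TRID_SIZE)
    return 0;                              /* old file: field is absent */
  for (i= 0; i < CF_TRID_SIZE; i++)
    trid|= (uint64_t) image[CF_OLD_SIZE + i] << (8 * i);
  return trid;
}

/*
  Value used to seed the transaction id generator: the bigger of what the
  log showed and what the control file remembers (covers removed logs).
*/
static uint64_t trid_generator_seed(uint64_t max_in_log, uint64_t max_in_cf)
{
  return max_in_cf > max_in_log ? max_in_cf : max_in_log;
}
```

Comparing the length before touching the optional field is the kind of guard that avoids computing a negative size for an old file, which is the class of bug the commit message mentions for bzero().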
llstr(max_trid_in_control_file, llbuf);
tprint(tracef, "Maximum transaction long id seen in control file: %s\n",
llbuf);
/*
If logs were deleted, or lost, trid in control file is needed to set
trnman's generator:
*/
set_if_bigger(max_long_trid, max_trid_in_control_file);
if (prepare_for_undo_phase && trnman_init(max_long_trid))
return -1;
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
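The skip rules listed for ma_recovery.c in the commit message above (skip an unusable table instead of failing, and skip records older than skip_redo_lsn in the REDO phase) can be sketched as a single decision function. The struct, enum and function names below are hypothetical and exist only for illustration; the real code works on MARIA_SHARE and log record headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;              /* hypothetical stand-in for LSN */

enum apply_decision { APPLY_RECORD, SKIP_RECORD };

struct recov_table {                 /* hypothetical view of a table in recovery */
  int   usable;                      /* 0 if corrupted or tagged as repaired by maria_chk */
  lsn_t skip_redo_lsn;               /* records below this LSN must not be applied */
};

/*
  Decide what the REDO phase does with a record at rec_lsn for table t.
  A NULL table models a missing table. Skipping moves recovery on to the
  next record instead of failing.
*/
static enum apply_decision redo_phase_decision(const struct recov_table *t,
                                               lsn_t rec_lsn)
{
  if (t == NULL || !t->usable)
    return SKIP_RECORD;              /* missing/corrupted: be robust, go on */
  if (rec_lsn < t->skip_redo_lsn)
    return SKIP_RECORD;              /* older than the repair or bulk insert */
  return APPLY_RECORD;
}
```

A caller in the UNDO phase would additionally emit a warning when it skips, since a skipped UNDO means a transaction was not fully rolled back for that table; that part is not shown here.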
trns_created= TRUE;
for (sid= 0; sid <= SHORT_TRID_MAX; sid++)
{
TrID long_trid= all_active_trans[sid].long_trid;
LSN gslsn= all_active_trans[sid].group_start_lsn;
TRN *trn;
if (gslsn != LSN_IMPOSSIBLE)
{
  tprint(tracef, "Group at LSN (%lu,0x%lx) short_trid %u incomplete\n",
         LSN_IN_PARTS(gslsn), sid);
  all_active_trans[sid].group_start_lsn= LSN_IMPOSSIBLE;
}
if (all_active_trans[sid].undo_lsn != LSN_IMPOSSIBLE)
{
  llstr(long_trid, llbuf);
  tprint(tracef, "Transaction long_trid %s short_trid %u uncommitted\n",
         llbuf, sid);
  /*
    dummy_transaction_object serves only for DDLs, where there is never a
    rollback or incomplete group. Unknown transactions (which have
    long_trid == 0) should have undo_lsn == LSN_IMPOSSIBLE.
  */
  if (long_trid == 0)
  {
    eprint(tracef, "Transaction with long_trid 0 should not roll back");
    ALERT_USER();
    return -1;
  }
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
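The ma_bitmap.c note in the commit message above describes why the first bitmap page is extended to 8190 bytes rather than 8192 before its 2-byte end marker is written. A minimal sketch of that reasoning follows, using an in-memory buffer instead of a real file. All toy_* names are hypothetical and not part of Maria; the "bm" marker value and the three-way classification are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BITMAP_PAGE_SIZE 8192u

/* Assumed 2-byte end marker; the real marker lives in ma_bitmap.c. */
static const unsigned char toy_marker[2] = { 'b', 'm' };

enum toy_page_state { TOY_PAGE_OK, TOY_PAGE_TOO_SHORT, TOY_PAGE_CORRUPTED };

/*
  Simulate creating the first bitmap page in two steps:
  1) extend the file with zeroes up to extend_to bytes,
  2) write the end marker into the last 2 bytes of the page.
  If crash_before_marker is set, step 2 never happens.
  Returns the resulting file length.
*/
static size_t toy_create_first_page(unsigned char *file, size_t extend_to,
                                    int crash_before_marker)
{
  assert(extend_to <= TOY_BITMAP_PAGE_SIZE);
  memset(file, 0, extend_to);
  if (crash_before_marker)
    return extend_to;
  memcpy(file + TOY_BITMAP_PAGE_SIZE - sizeof(toy_marker),
         toy_marker, sizeof(toy_marker));
  return TOY_BITMAP_PAGE_SIZE;
}

/* What a later reader of the page could conclude from length and marker. */
static enum toy_page_state toy_classify_first_page(const unsigned char *file,
                                                   size_t file_len)
{
  if (file_len < TOY_BITMAP_PAGE_SIZE)
    return TOY_PAGE_TOO_SHORT;          /* safe: recreate the page */
  if (memcmp(file + TOY_BITMAP_PAGE_SIZE - sizeof(toy_marker),
             toy_marker, sizeof(toy_marker)) == 0)
    return TOY_PAGE_OK;
  return TOY_PAGE_CORRUPTED;            /* full page without its marker */
}
```

With an 8192-byte extension, a crash before the marker write leaves a page that looks corrupted; with 8190 bytes, the same crash leaves a page that is merely too short and can be recreated.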
if (prepare_for_undo_phase)
{
if ((trn= trnman_recreate_trn_from_recovery(sid, long_trid)) == NULL)
return -1;
trn->undo_lsn= all_active_trans[sid].undo_lsn;
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
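The commit message above says that each CLR_END stores the LSN of the previous UNDO together with the type of the record being undone, and that trn->undo_lsn and state.records are updated when the CLR_END is written. The toy model below sketches that chain walk. Every toy_* name is hypothetical, LSNs are plain array indexes, and the records counter is kept on the transaction only to keep the sketch short; none of this is the real log handler API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LOG_MAX  32
#define TOY_LSN_NONE 0   /* "no earlier UNDO"; slot 0 of the log is unused */

enum toy_type { TOY_UNDO_ROW_INSERT, TOY_UNDO_ROW_DELETE, TOY_CLR_END };

struct toy_rec
{
  enum toy_type type;
  size_t prev_undo_lsn;   /* LSN of the transaction's previous UNDO */
  enum toy_type undone;   /* for CLR_END: type of the record undone */
};

struct toy_log { struct toy_rec rec[TOY_LOG_MAX]; size_t next; };
struct toy_trn { size_t undo_lsn; long records; };

static size_t toy_append(struct toy_log *log, struct toy_rec r)
{
  assert(log->next > 0 && log->next < TOY_LOG_MAX);
  log->rec[log->next]= r;
  return log->next++;
}

/* Normal operation: log an UNDO chained to the previous one. */
static void toy_log_undo(struct toy_log *log, struct toy_trn *trn,
                         enum toy_type type)
{
  struct toy_rec r= { type, trn->undo_lsn, type };
  trn->undo_lsn= toy_append(log, r);
  trn->records+= (type == TOY_UNDO_ROW_INSERT) ? 1 : -1;
}

/* UNDO phase: walk the chain backwards, writing one CLR_END per UNDO. */
static void toy_rollback(struct toy_log *log, struct toy_trn *trn)
{
  while (trn->undo_lsn != TOY_LSN_NONE)
  {
    struct toy_rec undo= log->rec[trn->undo_lsn];
    struct toy_rec clr= { TOY_CLR_END, undo.prev_undo_lsn, undo.type };
    toy_append(log, clr);
    /* "write hook": adjust records and move undo_lsn to the previous UNDO */
    trn->records+= (undo.type == TOY_UNDO_ROW_INSERT) ? -1 : 1;
    trn->undo_lsn= undo.prev_undo_lsn;
  }
}
```

Because the previous UNDO's LSN can legitimately be 0 for a transaction's first UNDO, 0 cannot double as a "this is not UNDO work" flag, which matches the commit's switch to a separate impossible value for that purpose.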
trn->first_undo_lsn= all_active_trans[sid].first_undo_lsn |
TRANSACTION_LOGGED_LONG_ID; /* because trn is known in log */
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
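The scenario in the commit message above (REDO - UNDO - REDO*, then a rollback that appends REDO - CLR) can be reproduced with a small toy scanner. This is only a sketch under simplifying assumptions: one transaction, REDOs counted rather than applied, and a group considered complete when an UNDO or CLR_END follows its REDOs. The toy_* names are hypothetical and do not come from ma_recovery.c.

```c
#include <assert.h>
#include <stddef.h>

enum toy_rec { TOY_REDO, TOY_UNDO, TOY_CLR_END, TOY_INCOMPLETE_GROUP };

/*
  Scan a log and return how many REDOs the REDO phase would execute.
  REDOs are held back until their group is closed by an UNDO or CLR_END.
  TOY_INCOMPLETE_GROUP discards the REDOs held back so far.
  *incomplete_at_end is set if REDOs are still pending at the end of the log.
*/
static size_t toy_redo_phase(const enum toy_rec *log, size_t n,
                             int *incomplete_at_end)
{
  size_t executed= 0, pending= 0, i;
  assert(log != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    switch (log[i])
    {
    case TOY_REDO:
      pending++;
      break;
    case TOY_UNDO:
    case TOY_CLR_END:
      executed+= pending;   /* group is complete: run its REDOs */
      pending= 0;
      break;
    case TOY_INCOMPLETE_GROUP:
      pending= 0;           /* boundary: earlier REDOs stay skipped */
      break;
    }
  }
  if (incomplete_at_end)
    *incomplete_at_end= (pending != 0);
  return executed;
}
```

Without the boundary record, a second recovery over REDO, UNDO, REDO*, REDO, CLR_END pairs REDO* with the rollback's CLR_END and executes three REDOs; with the boundary written after REDO*, it executes two, and REDO* stays skipped.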
if (gslsn != LSN_IMPOSSIBLE)
{
/*
UNDO phase will log some records. So, a future recovery may see:
REDO(from incomplete group) - REDO(from rollback) - CLR_END
and thus execute the first REDO (finding it in "a complete
group"). To prevent that:
*/
Injecting more "const" declarations into code which does not change pointed data. I ran gcc -Wcast-qual on storage/maria, this identified un-needed casts, a couple of functions which said they had a const parameter though they changed the pointed content! This is fixed here. Some suspicious places receive a comment. The original intention of running -Wcast-qual was to find what code changes R-tree keys: I added const words, but hidden casts like those of int2store (casts target to (uint16*)) removed const checking; -Wcast-qual helped find those hidden casts. Log handler does not change the content pointed by LEX_STRING::str it receives, so we now use a struct which has a const inside, to emphasize this and be able to pass "const uchar*" buffers to log handler without fear of their content being changed by it. One-line fix for a merge glitch (when merging from MyISAM). include/m_string.h: As Maria's log handler uses LEX_STRING but never changes the content pointed by LEX_STRING::str, and assigns uchar* into this member most of the time, we introduce a new struct LEX_CUSTRING (C const U unsigned) for the log handler. include/my_global.h: In macros which read pointed content: use const pointers so that gcc -Wcast-qual does not warn about casting a const pointer to non-const. include/my_handler.h: In macros which read pointed content: use const pointers so that gcc -Wcast-qual does not warn about casting a const pointer to non-const. ha_find_null() does not change *a. include/my_sys.h: insert_dynamic() does not change *element. include/myisampack.h: In macros which read pointed content: use const pointers so that gcc -Wcast-qual does not warn about casting a const pointer to non-const. mysys/array.c: insert_dynamic() does not change *element mysys/my_handler.c: ha_find_null() does not change *a storage/maria/ma_bitmap.c: Log handler receives const strings now storage/maria/ma_blockrec.c: Log handler receives const strings now. 
_ma_apply_undo_row_delete/update() do change *header. storage/maria/ma_blockrec.h: correct prototype storage/maria/ma_check.c: Log handler receives const strings now. Un-needed casts storage/maria/ma_checkpoint.c: Log handler receives const strings now storage/maria/ma_checksum.c: unneeded cast storage/maria/ma_commit.c: Log handler receives const strings now storage/maria/ma_create.c: Log handler receives const strings now storage/maria/ma_dbug.c: fixing warning of gcc -Wcast-qual storage/maria/ma_delete.c: Log handler receives const strings now storage/maria/ma_delete_all.c: Log handler receives const strings now storage/maria/ma_delete_table.c: Log handler receives const strings now storage/maria/ma_dynrec.c: fixing some warnings of gcc -Wcast-qual. Unneeded casts removed. Comment about function which lies. storage/maria/ma_ft_parser.c: fix for warnings of gcc -Wcast-qual, removing unneeded casts storage/maria/ma_ft_update.c: less casts, comment storage/maria/ma_key.c: less casts, stay const (warnings of gcc -Wcast-qual) storage/maria/ma_key_recover.c: Log handler receives const strings now storage/maria/ma_loghandler.c: Log handler receives const strings now storage/maria/ma_loghandler.h: Log handler receives const strings now storage/maria/ma_loghandler_lsn.h: In macros which read pointed content: use const pointers so that gcc -Wcast-qual does not warn about casting a const pointer to non-const. storage/maria/ma_page.c: Log handler receives const strings now; more const storage/maria/ma_recovery.c: Log handler receives const strings now storage/maria/ma_rename.c: Log handler receives const strings now storage/maria/ma_rt_index.c: more const, to emphasize that functions don't change pointed content. 
best_key= NULL was forgotten during merge from MyISAM a few days ago, was causing a Valgrind warning storage/maria/ma_rt_index.h: new proto storage/maria/ma_rt_key.c: more const storage/maria/ma_rt_key.h: new proto storage/maria/ma_rt_mbr.c: more const for functions which deserve it storage/maria/ma_rt_mbr.h: new prototype storage/maria/ma_rt_split.c: make const what is not changed. storage/maria/ma_search.c: un-needed casts, more const storage/maria/ma_sp_key.c: more const storage/maria/ma_unique.c: un-needed casts. storage/maria/ma_write.c: Log handler receives const strings now storage/maria/maria_def.h: some more const storage/maria/unittest/ma_test_loghandler-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_multithread-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_noflush-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_nologs-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_purge-t.c: Log handler receives const strings now
2008-04-03 15:40:25 +02:00
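The commit message above introduces LEX_CUSTRING so that "const uchar*" buffers can be handed to the log handler without casts. The sketch below shows the idea with a stand-in struct; TOY_LEX_CUSTRING and toy_gather_parts() are hypothetical names, and the exact layout of the real LEX_CUSTRING in include/m_string.h is assumed rather than quoted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a string part whose bytes the consumer promises not to change. */
typedef struct toy_lex_custring
{
  const unsigned char *str;
  size_t length;
} TOY_LEX_CUSTRING;

/*
  Copy all parts into dst, as a log writer might when assembling a record.
  Returns the number of bytes copied, or 0 if dst is too small.
  The parts are only read, so callers may pass const buffers directly.
*/
static size_t toy_gather_parts(unsigned char *dst, size_t capacity,
                               const TOY_LEX_CUSTRING *parts, size_t count)
{
  size_t used= 0, i;
  assert(dst != NULL && parts != NULL);
  for (i= 0; i < count; i++)
  {
    if (parts[i].length > capacity - used)
      return 0;
    memcpy(dst + used, parts[i].str, parts[i].length);
    used+= parts[i].length;
  }
  return used;
}
```

With a non-const str member, assigning a const buffer would need a cast that hides exactly the kind of problem gcc -Wcast-qual is meant to report.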
LEX_CUSTRING log_array[TRANSLOG_INTERNAL_PARTS];
LSN lsn;
if (translog_write_record(&lsn, LOGREC_INCOMPLETE_GROUP,
trn, NULL, 0,
TRANSLOG_INTERNAL_PARTS, log_array,
NULL, NULL))
return -1;
}
}
uncommitted++;
2007-07-26 11:56:21 +02:00
}
#ifdef MARIA_VERSIONING
/*
If real recovery: if the transaction was committed, move it to a separate
list so that it can be purged soon.
*/
#endif
}
my_free(all_active_trans, MYF(MY_ALLOW_ZERO_PTR));
all_active_trans= NULL;
/*
The UNDO phase reuses some of the normal run-time ROLLBACK code (it
generates log records, etc.), so prepare the tables for that.
*/
addr= translog_get_horizon();
for (sid= 0; sid <= SHARE_ID_MAX; sid++)
{
MARIA_HA *info= all_tables[sid].info;
if (info != NULL)
{
prepare_table_for_close(info, addr);
/*
But we don't close it; we leave it available for the UNDO phase;
it's likely that the UNDO phase will need it.
*/
if (prepare_for_undo_phase)
translog_assign_id_to_share_from_recovery(info->s, sid);
}
}
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand.
Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before).
BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches?
mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places)
mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing.
storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false).
storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come).
storage/maria/ma_blockrec.h: update to new names
storage/maria/ma_checkpoint.c: update to new names
storage/maria/ma_extra.c: update to new names
storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP
storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP
storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
return uncommitted;
}
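The LOGREC_INCOMPLETE_GROUP fix described in the 2007-12-10 commit message can be illustrated with a toy model. The sketch below is not Maria code: the `rec_kind` enum and `count_applied_redos()` are hypothetical names, and it assumes a single transaction, whereas the real REDO phase tracks groups per transaction. It only shows why a boundary record keeps an orphan REDO from being paired with a later CLR_END.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record kinds for illustration; not Maria's LOGREC_* enum. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Count how many REDO records a scan of the log would execute.
  REDOs are held as "pending" until a group-ending record (UNDO or
  CLR_END) is seen. An INCOMPLETE_GROUP marker discards pending REDOs,
  so they can never be paired with a later group end.
*/
static size_t count_applied_redos(const enum rec_kind *log, size_t n)
{
  size_t pending= 0, applied= 0, i;
  assert(log != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    switch (log[i])
    {
    case REC_REDO:
      pending++;
      break;
    case REC_UNDO:
    case REC_CLR_END:
      applied+= pending;        /* group is complete: execute its REDOs */
      pending= 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending= 0;               /* boundary: orphan REDOs stay skipped */
      break;
    }
  }
  return applied;               /* REDOs still pending at log end are skipped */
}
```

In this model, the log REDO, UNDO, REDO*, REDO, CLR_END executes three REDOs (REDO* is wrongly paired with CLR_END), while the same log with an INCOMPLETE_GROUP record after REDO* executes only two.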
2007-07-26 11:56:21 +02:00
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok) and roll back, leaving the log as REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand.
Note: ma_test_all fails with "maria_chk: error: Key 1 - Found too many records", not due to this patch (failed before).
BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches?
mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places)
mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies the statement which the script should use for crashing.
storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false).
storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come).
storage/maria/ma_blockrec.h: update to new names
storage/maria/ma_checkpoint.c: update to new names
storage/maria/ma_extra.c: update to new names
storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP
storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP
storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
static int run_undo_phase(uint uncommitted)
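The LOGREC_INCOMPLETE_GROUP fix described in the commit message above can be illustrated with a small model of how REDOs are grouped during a log scan. This is only a sketch under stated assumptions: the `rec_kind` enum and `count_executed_redos()` are hypothetical names invented for illustration, not the real LOGREC_* values or any function in ma_recovery.c. It assumes REDOs are buffered until a group-ending record (UNDO or CLR_END) arrives, and that an incomplete-group marker discards whatever is buffered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record kinds (not the real LOGREC_* values). */
enum rec_kind
{
  REC_REDO,             /* a REDO record, buffered until its group ends */
  REC_GROUP_END,        /* an UNDO or CLR_END: ends the group of REDOs */
  REC_INCOMPLETE_GROUP  /* boundary marker: pending REDOs must be skipped */
};

/*
  Scan one transaction's records in log order and return how many REDOs
  would be executed. REDOs only count once a group-ending record is seen;
  an incomplete-group marker throws away the REDOs buffered so far.
*/
static size_t count_executed_redos(const enum rec_kind *log, size_t n)
{
  size_t pending= 0, executed= 0;
  size_t i;
  assert(log != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    switch (log[i])
    {
    case REC_REDO:
      pending++;
      break;
    case REC_GROUP_END:
      executed+= pending;
      pending= 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending= 0;
      break;
    }
  }
  return executed;
}
```

In this model, the log REDO - UNDO - REDO* - REDO - CLR executes three REDOs (REDO* is wrongly paired with the CLR), while the same log with a marker right after REDO* executes only two.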
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistant
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)
mysys/my_realloc.c: noted an issue in my_realloc()
sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache.
storage/maria/ha_maria.cc: update to new prototype
storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full-size bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen whenever a REDO involves creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page
storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery.
storage/maria/ma_control_file.h: new prototype
storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDOs for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h: new function
storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
* enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object.
* test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection)
* comments.
storage/maria/ma_recovery.c:
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet)
* reworking all around (less duplication)
storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c: tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
* update to new prototype
* no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
* a function for Recovery to create a transaction (TRN), needed in the UNDO phase
* a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
{
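The bitmap-page reasoning in the ma_bitmap.c entry above (chsize to 8190 bytes instead of 8192, so that a crash leaves a too-short page rather than a full page without its marker) can be sketched as a small classifier. This is an illustration under assumptions, not the real code: `classify_first_bitmap_page()`, the `page_state` enum and the constants are hypothetical names, and the two-byte 'b','m' marker is assumed from the "bm" signature mentioned in an earlier commit message.

```c
#include <assert.h>
#include <string.h>

#define BITMAP_PAGE_SIZE 8192   /* the "8k bitmap page" from the message */
#define BITMAP_MARKER_SIZE 2    /* marker occupies the last 2 bytes */

/* Assumed marker bytes; hypothetical stand-in for maria_bitmap_marker. */
static const unsigned char bitmap_marker[BITMAP_MARKER_SIZE]= { 'b', 'm' };

enum page_state
{
  PAGE_TOO_SHORT,   /* recovery can recreate the page entirely */
  PAGE_OK,          /* full page with a correct end marker */
  PAGE_CORRUPTED    /* full-size page whose end marker is missing */
};

/* Classify the first bitmap page of a data file held in memory. */
static enum page_state
classify_first_bitmap_page(const unsigned char *file, size_t file_len)
{
  assert(file != NULL);
  if (file_len < BITMAP_PAGE_SIZE)
    return PAGE_TOO_SHORT;
  if (memcmp(file + BITMAP_PAGE_SIZE - BITMAP_MARKER_SIZE,
             bitmap_marker, BITMAP_MARKER_SIZE) == 0)
    return PAGE_OK;
  return PAGE_CORRUPTED;
}
```

With the old scheme, a crash between the chsize to 8192 bytes and the marker write yields PAGE_CORRUPTED; with the new scheme the file is only 8190 bytes long at that point, which yields PAGE_TOO_SHORT and lets recovery rebuild the page.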
UNDO of rows now puts back all parts of the row on their original pages and positions
Added variable _dbug_on_ to speed up execution when DBUG is not going to be used
Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0)
Fixed some bugs with 'non_flushable' marking of bitmap pages
Don't use 'non_flushable' marking of bitmap pages for non-transactional tables
SHOW CREATE TABLE now shows if table was created with page checksums
Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO
More tests (especially for blobs) and DBUG_ASSERTS()
More readable output from maria_read_log and maria_chk
Fixed wrong shift that caused Maria to crash on files > 4G
Mark tables as crashed if REDO fails
dbug/dbug.c:
  Changed to use my_bool (allowed me to remove some windows specific code)
  Added variable _dbug_on_ to speed up execution when DBUG is not going to be used
  Removed initialization of variables if not needed
include/my_dbug.h:
  Use my_bool for some functions that were defined as BOOLEAN in dbug.c code
  Added DEBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used
include/my_global.h:
  Define my_bool early
  Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago
mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests
mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid
mysql-test/r/maria.result: Added testing of page checksums
mysql-test/t/crash_commit_before-master.opt: Added --debug-on as test requires DBUG to work
mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as test requires DBUG to work
mysql-test/t/maria-recovery-master.opt: Added --debug-on as test requires DBUG to work
mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid
mysql-test/t/maria.test: Added testing of page checksums
sql/mysqld.cc:
  Added --debug-on option (to be able to turn off DBUG with --debug-on=0)
  Indentation fixes
  Removed end spaces
sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used
storage/maria/Makefile.am: Added ma_test_big.sh
storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE)
storage/maria/ma_bitmap.c:
  Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed.
  Use TAIL_PAGE_COUNT_MARKER for tail pages
  Set 'sub_blocks' for, and only for, the head page or the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's
  Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT)
  Increase the calculated 'head_length' with the number of bytes used for extents.
  Update row->space_on_head_page also in _ma_bitmap_find_new_place()
  Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling)
  Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not.
  Don't use 'non_flushable' marking of bitmap pages for non-transactional tables.
  Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages.
  Added more DBUG_ASSERT() to find possible errors in other code
  Some code simplifications by adding new local variables
storage/maria/ma_blockrec.c:
  UNDO of rows now puts back all parts of the row on their original pages and positions.
  Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information
  This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free)
  Use PAGE_COUNT to mark if an extent is the start of a blob. (Needed for extent_to_bitmap_blocks())
  Added check_directory() for checking that directory entries are correct.
  Added checking of row checksums when reading rows (with EXTRA_DEBUG)
  Added make_space_for_directory() and extend_directory() for doing expansion of directory
  Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO
  Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry
  Added _ma_update_at_original_place() for UNDO of DELETES
  Added row->min_length to hold minimum required space needed on head page
  Changed find_free_position() to use make_space_for_directory()
  Changed make_empty_page() to allow optional creation of directory entry
  Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization)
  Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page
  Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position
  Ensure allocations of tail blocks are of at least MIN_TAIL_SIZE.
  Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache)
  Write original extent information in UNDO entry, not compacted ones (we need position to tails!)
  When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP)
  Fixed some bugs in directory handling
  Fixed bug where we wrote wrong lsn to blob pages
  Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs
  Ensure we call _ma_bitmap_flushable() also in case of errors
  When doing an update, first delete old entries, then search in bitmap for where to put new information
  Info->s -> share
  Rowid -> rowid
  More DBUG_ASSERT()
storage/maria/ma_blockrec.h:
  Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER
  Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits()
storage/maria/ma_check.c:
  Don't write extra empty line if there are no deleted blocks
  Ignore START_EXTENT_BIT's in page count
  Call _ma_fast_unlock_key_del() to free key_del link
storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del())
storage/maria/ma_create.c: Changed constant to macro
storage/maria/ma_delete.c: For deleted keys, log also position to row
storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER
storage/maria/ma_key_recover.c:
  Added bzero() of LSN that confused page cache in case of uninitialized block
  Mark file crashed if applying of index changes fails
  Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link.
storage/maria/ma_locking.c:
  Added usage of MARIA_FILE_OPEN_COUNT_OFFSET
  Added _ma_mark_file_crashed()
storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory
storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easily mark files as crashed
storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings.
  (Need to investigate if this is ever needed)
storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G
storage/maria/ma_recovery.c:
  In case of errors, start writing on new line if we were in %## %## printing mode (Made errors more readable)
  Changed global variable name from warnings -> recovery_warnings
  Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant
  Removed special handling of row position for deleted keys. Keys now always include row positions
  _ma_apply_undo_row_delete() now gets page and row position
  Added check that we don't loop forever when handling undo's (in case of bug in undo chain)
  Print name of failed REDO/UNDO
storage/maria/ma_recovery.h: Removed old comment
storage/maria/ma_static.c: Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables)
storage/maria/ma_test2.c:
  Added option -u to specify number of rows to update
  Changed old option -u to be -A, as for ma_test1
  Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update)
  First created blob is now of max blob length to ensure we have at least one big blob in the table
storage/maria/ma_test_all.sh: More tests
storage/maria/ma_test_recovery.expected: Updated results
storage/maria/ma_test_recovery:
  Changed tests to use bigger blobs (not just 1K)
  Added new tests that test recovery of update with blobs
  Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO)
storage/maria/ma_update.c: Simplify code (changed * to if)
storage/maria/maria_chk.c: Make output more readable
storage/maria/maria_def.h:
  Changed 'changed' to int to prepare for more bits
  Added 2 more bytes to status information
  Added 'st_maria_row->min_length' for storing min length needed on head page
  Added 'st_maria_handler->blob_buff & blob_buff_size' for storing blobs
  Moved all tuning parameters into one block
  Added MARIA_SMALL_BLOB_BUFFER
  Added _ma_mark_file_crashed()
storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update)
storage/maria/ma_test_big.sh:
  Testing of insert, update, delete, recovery and undo of rows with blobs
  Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
LSN last_undo;
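The commit message above mentions a check that undo handling does not loop forever if the undo chain is buggy. One way to picture such a check, as a hedged sketch rather than the code in this file: each UNDO points to an older UNDO, so the LSN must strictly decrease at every step. `lsn_t`, `struct undo_rec` and `walk_undo_chain()` are hypothetical names for illustration; the real code may well use an assertion instead of an error return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;   /* hypothetical stand-in for LSN; 0 = end of chain */

/* Hypothetical record: its own LSN and the LSN of the previous UNDO. */
struct undo_rec
{
  lsn_t lsn;
  lsn_t prev_undo_lsn;
};

/*
  Follow the undo chain starting at undo_lsn. Returns the number of UNDOs
  visited, or -1 if the chain is broken: a referenced record is missing,
  or an LSN does not strictly decrease (which would otherwise allow an
  endless loop).
*/
static int walk_undo_chain(const struct undo_rec *recs, size_t n,
                           lsn_t undo_lsn)
{
  lsn_t last_undo= undo_lsn + 1;
  int applied= 0;
  assert(recs != NULL || n == 0);
  while (undo_lsn != 0)
  {
    const struct undo_rec *rec= NULL;
    size_t i;
    if (undo_lsn >= last_undo)
      return -1;                      /* chain goes forward or in a circle */
    last_undo= undo_lsn;
    for (i= 0; i < n; i++)
    {
      if (recs[i].lsn == undo_lsn)
      {
        rec= &recs[i];
        break;
      }
    }
    if (rec == NULL)
      return -1;                      /* dangling pointer in the chain */
    applied++;                        /* a real UNDO would be executed here */
    undo_lsn= rec->prev_undo_lsn;
  }
  return applied;
}
```

Keeping the previously visited LSN in a local variable is enough to detect a cycle, because any cycle must contain at least one step that does not go backwards in the log.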
Fixed several bugs in page CRC handling
- Ignore CRC errors in REDO for potential new pages
- Ignore CRC errors when repairing tables
- Don't do readcheck callback on read error
- Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC
- Check index page for length before calculating CRC to catch bad pages
Fixed bugs where we used wrong file descriptor to read/write bitmaps
Fixed wrong hash key in 'files_in_flush'
Fixed wrong lock method when writing bitmap
Fixed some wrong printf statements in check/repair that caused core dumps
Fixed argument to translog_page_validator that caused reading of log files to fail
Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages.
Use fast 'dummy' pagecheck callbacks for temporary tables
Don't die silently if flush finds pinned pages
Give error (for now) if one tries to create a transactional table with fulltext or spatial keys
Removed some unneeded calls to pagecache_file_init()
Added checking of pagecache checksums to ma_test1 and ma_test2
More DBUG
Fixed some DBUG_PRINT to be in line with rest of the code
include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC
mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement
mysql-test/r/maria.result:
  Added TRANSACTIONAL=0 when testing with fulltext keys
  Added test that verifies we can't yet create transactional table with fulltext or spatial keys
mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys
mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement
mysql-test/t/maria.test:
  Added TRANSACTIONAL=0 when testing with fulltext keys
  Added test that verifies we can't yet create transactional table with fulltext or spatial keys
mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys
mysys/my_fopen.c: Fd: -> fd:
mysys/my_handler.c: Added new error messages
mysys/my_lock.c: Fd: -> fd:
mysys/my_pread.c: Fd: -> fd:
mysys/my_read.c: Fd: -> fd:
mysys/my_seek.c: Fd: -> fd:
mysys/my_sync.c: Fd: -> fd:
mysys/my_write.c: Fd: -> fd:
sql/mysqld.cc: Fixed wrong argument to my_uuid_init()
sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff)
storage/maria/ma_bitmap.c:
  Fixed wrong lock method when writing bitmap
  Fixed valgrind error
  Use fast 'dummy' pagecheck callbacks for temporary tables
  Faster bitmap handling for non-transactional tables
storage/maria/ma_blockrec.c:
  Fixed that bitmap reading is done with the correct filehandle
  Handle reading of pages with wrong CRC when page content doesn't matter
  Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs)
storage/maria/ma_check.c:
  Split long strings for readability
  Fixed some wrong printf statements that caused core dumps
  Use bitmap.file for bitmaps
  Ignore pages with wrong CRC
storage/maria/ma_close.c: More DBUG_PRINT
storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys
storage/maria/ma_key_recover.c:
  Ignore HA_ERR_WRONG_CRC for new pages
  info->s -> share
  Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages.
storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator()
storage/maria/ma_open.c:
  Removed old VMS specific code
  Added function to setup pagecache callbacks
  Moved code around to set 'share->temporary' early
  Removed some unneeded calls to pagecache_file_init()
storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages.
storage/maria/ma_pagecache.c:
  Don't do readcheck callback on read error
  Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page
  Set my_errno to HA_ERR_INTERNAL_ERROR if flush() finds pinned pages
  Don't die silently if flush finds pinned pages.
  Use correct file descriptor when flushing pages
  Fixed wrong hash key in 'files_in_flush'; this must be the file descriptor, not the PAGECACHE_FILE, as there may be several PAGECACHE_FILE for the same file descriptor
  More DBUG_PRINT
storage/maria/ma_pagecrc.c:
  Removed inline from not tiny static function
  Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused)
  CRCerror -> error (to keep code uniform)
  Print crc with %lu, as in my_checksum()
  uchar* -> uchar *
  Check index page for length before calculating CRC to catch bad pages
  Added 'dummy' crc_check and filler functions that are used for temporary tables
storage/maria/ma_recovery.c:
  More DBUG
  More messages to users to give information about which phase failed
  Better error message if recovery failed
storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs)
storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs)
storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error()
storage/maria/maria_def.h:
  Added format information to _ma_check_print_xxxx functions
  uchar* -> uchar *
2007-12-18 02:21:32 +01:00
DBUG_ENTER("run_undo_phase");
2007-12-10 23:26:53 +01:00
if (uncommitted > 0)
2007-08-29 16:43:01 +02:00
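The bitmap-page fix described in the commit message above (extend the file to 8190 bytes rather than 8192 before writing the 2-byte end marker) can be illustrated with a small sketch. This is not code from ma_bitmap.c: the enum, the function name `classify_first_bitmap_page()` and its parameters are hypothetical and only model the decision the message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome of inspecting the first bitmap page at recovery. */
enum page_state { PAGE_MISSING, PAGE_OK, PAGE_CORRUPT };

/*
  Illustrative classification, assuming a page of block_size bytes whose
  last two bytes hold the end marker.
  - A file shorter than one full page means the page was never completely
    written: it can simply be recreated.
  - A full-size page without its marker looks like corruption.
*/
static enum page_state
classify_first_bitmap_page(size_t file_length, size_t block_size,
                           int marker_present)
{
  assert(block_size >= 2);
  if (file_length < block_size)
    return PAGE_MISSING;                 /* too short: recreate entirely */
  return marker_present ? PAGE_OK : PAGE_CORRUPT;
}
```

Under this model, a crash after extending the file to 8190 bytes yields `PAGE_MISSING` (recoverable), whereas the old 8192-byte extension followed by a crash yields `PAGE_CORRUPT`.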
{
WL#3071 - Maria checkpoint * Preparation for having a background checkpoint thread: frequency of checkpoint taken by that thread is now configurable by the user: global variable maria_checkpoint_frequency, in seconds, default 30 (checkpoint every 30th second); 0 means no checkpoints (and thus no background thread, thus no background flushing, that will probably only be used for testing). * Don't take checkpoints in Recovery if it didn't do anything significant; thus no checkpoint after a clean shutdown/restart. The only checkpoint which is never skipped is the one at shutdown. * fix for a test failure (after-merge fix) include/maria.h: new variable mysql-test/suite/rpl/r/rpl_row_flsh_tbls.result: result update mysql-test/suite/rpl/t/rpl_row_flsh_tbls.test: position update (=after merge fix, as this position was already changed into 5.1 and not merged here, causing test to fail) storage/maria/ha_maria.cc: Checkpoint's frequency is now configurable by the user: global variable maria_checkpoint_frequency. Changing it on the fly requires us to shutdown/restart the background checkpoint thread, as the loop done in that thread assumes a constant checkpoint interval. Default value is 30: a checkpoint every 30 seconds (yes, I know, physicists will remind that it should be named "period" then). ha_maria now asks for a background checkpoint thread when it starts, but this is still overruled (disabled) in ma_checkpoint_init(). storage/maria/ma_checkpoint.c: Checkpoint's frequency is now configurable by the user: background thread takes a checkpoint every maria_checkpoint_interval-th second. If that variable is 0, no checkpoints are taken. Note, I will enable the background thread only in a later changeset. storage/maria/ma_recovery.c: Don't take checkpoints at the end of the REDO phase and at the end of Recovery if Recovery didn't make anything significant (didn't open any tables, didn't rollback any transactions). 
With this, after a clean shutdown, Recovery shouldn't take any checkpoint, which makes starting faster (we save a few fsync()s of the log and control file).
2007-10-09 10:38:31 +02:00
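The checkpoint-skipping rule from the commit message above can be sketched as follows. This is a minimal illustration, not the logic of ma_recovery.c: the struct `recovery_stats` and both function names are hypothetical, and the only facts taken from the message are that recovery skips its checkpoints when it opened no tables and rolled back no transactions, and that the shutdown checkpoint is never skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what one recovery run actually did. */
struct recovery_stats
{
  unsigned tables_opened;
  unsigned transactions_rolled_back;
};

/* Recovery did something significant, so a checkpoint is worth its fsync()s. */
static bool checkpoint_is_useful(const struct recovery_stats *st)
{
  assert(st != NULL);
  return st->tables_opened > 0 || st->transactions_rolled_back > 0;
}

/* The shutdown checkpoint is always taken; others only if useful. */
static bool should_take_checkpoint(const struct recovery_stats *st,
                                   bool at_shutdown)
{
  return at_shutdown || checkpoint_is_useful(st);
}
```

After a clean shutdown and restart, both counters stay at zero, so no checkpoint is taken during recovery and startup avoids the extra syncs of the log and control file.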
checkpoint_useful= TRUE;
if (tracef != stdout)
{
if (recovery_message_printed == REC_MSG_NONE)
WL#3071 Maria checkpoint, WL#3072 Maria recovery instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]). Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line. sql_print_error() changed to use my_progname_short (nicer display). mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r"). include/my_sys.h: new flags to tell error_handler_hook that this is not an error but an information or warning mysql-test/mysql-test-run.pl: when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r") mysys/my_init.c: set my_progname_short mysys/my_static.c: my_progname_short added sql/mysqld.cc: * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria. 
* plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs()) * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen) storage/maria/ma_checkpoint.c: fprintf(stderr) -> ma_message_no_user() storage/maria/ma_checkpoint.h: function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr). storage/maria/ma_recovery.c: To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line. 071116 18:42:16 [Note] mysqld: Maria engine: starting recovery recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds); 071116 18:42:16 [Note] mysqld: Maria engine: recovery done storage/maria/maria_chk.c: my_progname_short moved to mysys storage/maria/maria_read_log.c: my_progname_short moved to mysys storage/myisam/myisamchk.c: my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
print_preamble();
fprintf(stderr, "transactions to roll back:");
recovery_message_printed= REC_MSG_UNDO;
}
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such transaction, a future recovery won't pair the incomplete group with the CLR_END (as there is LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
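The role of LOGREC_INCOMPLETE_GROUP described in the commit message above can be shown with a toy model of one transaction's log. This is a simplified sketch under stated assumptions, not the real record scanner: the enum values and `count_applied_redos()` are hypothetical, and a "group" is modelled as a run of REDOs closed by an UNDO or a CLR_END.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified record kinds for one transaction (names are illustrative). */
enum rec_type { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Count the REDOs a recovery pass would apply. REDOs stay pending until an
  UNDO or CLR_END closes their group. A boundary record discards pending
  REDOs, and REDOs still pending at end of log are ignored as incomplete.
*/
static unsigned count_applied_redos(const enum rec_type *log, size_t n)
{
  unsigned applied= 0, pending= 0;
  size_t i;
  assert(log != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    switch (log[i])
    {
    case REC_REDO:
      pending++;
      break;
    case REC_UNDO:
    case REC_CLR_END:
      applied+= pending;                 /* group is complete */
      pending= 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending= 0;                        /* boundary: drop orphan REDOs */
      break;
    }
  }
  return applied;
}
```

With the log `REDO UNDO REDO*` the orphan REDO* is ignored. If rollback then appends `REDO CLR_END` without a boundary, a second recovery would wrongly pair REDO* with the CLR_END and apply it; with the boundary record written before the rollback records, REDO* stays skipped.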
tprint(tracef, "%u transactions will be rolled back\n", uncommitted);
UNDO of rows now puts back all parts of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for not transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed if REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that were defined as BOOLEAN in dbug.c code Added DBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as test requires DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as test requires DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as test requires DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn off DBUG with --debug-on=0) Indentation fixes 
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for not transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplifications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all parts of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directory entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minimum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position Ensure allocation of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed some bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there are no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused page cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easily mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings. 
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we were in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always include row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that test recovery of update with blobs Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_mara_row->min_length' for storing min length needed on head page Added 'st_mara_handler->blob_buff & blob_buff_size' for 
storing blobs Moved all tuning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before, blobs were always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
procent_printed= 1;
for( ; ; )
{
char llbuf[22];
TRN *trn;
if (recovery_message_printed == REC_MSG_UNDO)
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
{
fprintf(stderr, " %u", uncommitted);
2008-01-17 23:59:32 +01:00
fflush(stderr);
}
WL#3072 Maria recovery: fix for bug: if a crash happened right after writing a REDO like this: REDO - UNDO - REDO*, then recovery would ignore the last REDO* (ok), rollback: REDO - UNDO - REDO* - REDO - CLR, and a next recovery would thus execute REDO* instead of skipping it again. Recovery now logs LOGREC_INCOMPLETE_GROUP when it meets REDO* for the first time, to draw a boundary and ensure it is always skipped. Tested by hand. Note: ma_test_all fails "maria_chk: error: Key 1 - Found too many records" not due to this patch (failed before). BitKeeper/triggers/post-commit: no truncation of the commit mail, or how to review patches? mysql-test/include/maria_verify_recovery.inc: let caller choose the statement used to crash (sometimes we want the crash to happen at special places) mysql-test/t/maria-recovery.test: user of maria_verify_recovery.inc now specifies statement which the script should use for crashing. storage/maria/ma_bitmap.c: it's easier to search for all places using functions from the bitmap module (like in ma_blockrec.c) if those exported functions all start with "_ma_bitmap": renaming some of them. Assertion that when we read a bitmap page, overwriting bitmap->map, we are not losing information (i.e. bitmap->changed is false). storage/maria/ma_blockrec.c: update to new names. Adding code (disabled, protected by a #ifdef) that I use to test certain crash scenarios (more to come). storage/maria/ma_blockrec.h: update to new names storage/maria/ma_checkpoint.c: update to new names storage/maria/ma_extra.c: update to new names storage/maria/ma_loghandler.c: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_loghandler.h: new LOGREC_INCOMPLETE_GROUP storage/maria/ma_recovery.c: When at the end of the REDO phase we have identified some transactions with incomplete REDO groups (REDOs without an UNDO or CLR_END), for each of them we log LOGREC_INCOMPLETE_GROUP. 
This way, the upcoming UNDO phase can write more records for such a transaction, and a future recovery won't pair the incomplete group with the CLR_END (as there is a LOGREC_INCOMPLETE_GROUP to draw a boundary).
2007-12-10 23:26:53 +01:00
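The boundary role of LOGREC_INCOMPLETE_GROUP described in this commit can be illustrated with a toy model. This is only a sketch under stated assumptions: `rec_kind`, its values and `count_applied_redos()` are hypothetical names invented for illustration, not the real record types or functions of ma_recovery.c, and the model assumes a single transaction whose REDOs are buffered until the UNDO or CLR_END that closes their group is seen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record kinds; not the real LOGREC_* values. */
enum rec_kind { REC_REDO, REC_UNDO, REC_CLR_END, REC_INCOMPLETE_GROUP };

/*
  Count how many REDOs a (toy) REDO phase would apply for one transaction.
  REDOs are buffered until the record closing their group (UNDO or CLR_END)
  is seen; an INCOMPLETE_GROUP marker throws the buffered REDOs away.
  REDOs still buffered at the end of the log are ignored (incomplete group).
*/
static size_t count_applied_redos(const enum rec_kind *log, size_t len)
{
  size_t pending= 0, applied= 0, i;
  assert(log != NULL || len == 0);
  for (i= 0; i < len; i++)
  {
    switch (log[i]) {
    case REC_REDO:
      pending++;                /* wait for the end of the group */
      break;
    case REC_UNDO:
    case REC_CLR_END:
      applied+= pending;        /* group is complete: apply its REDOs */
      pending= 0;
      break;
    case REC_INCOMPLETE_GROUP:
      pending= 0;               /* boundary: never apply these REDOs */
      break;
    }
  }
  return applied;
}
```

In this model, the log REDO, UNDO, REDO*, REDO, CLR_END applies three REDOs, because REDO* ends up paired with the CLR_END written by the rollback. With a marker logged right after REDO*, only two REDOs are applied and REDO* stays skipped.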
if ((uncommitted--) == 0)
break;
trn= trnman_get_any_trn();
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
DBUG_ASSERT(trn != NULL);
llstr(trn->trid, llbuf);
tprint(tracef, "Rolling back transaction of long id %s\n", llbuf);
UNDO of rows now puts back all parts of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn off DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for non-transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed if REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that were defined as BOOLEAN in dbug.c code Added DBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as the test requires DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as the test requires DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as the test requires DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn off DBUG with --debug-on=0) Indentation fixes 
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for non-transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplifications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all parts of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directory entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minimum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_insert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in its original position Ensure allocation of tail blocks is of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed some bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there are no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused page cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easily mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings. 
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we were in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always include row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Changed version number of Maria files to not accidentally use old ones (because of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before blobs were always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that test recovery of update with blobs Removed comparison of .MAD file as it's not guaranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_maria_row->min_length' for storing min length needed on head page Added 'st_maria_handler->blob_buff & blob_buff_size' for 
storing blobs Moved all tuning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before blobs were always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
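The commit above mentions a check against looping forever on a corrupted undo chain. A minimal sketch of that idea follows, assuming that each UNDO record points to an earlier (smaller) LSN and that 0 ends the chain. The array `prev_of` and the function `walk_undo_chain()` are hypothetical stand-ins for reading records from the log, and the sketch returns an error where the code in this file uses a debug assertion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
  Walk a toy undo chain. prev_of[lsn] holds the LSN of the previous UNDO of
  the same transaction, 0 meaning "no more UNDOs". Every step must move to a
  strictly smaller LSN, so a corrupted chain (for example a cycle) is
  detected instead of looping forever.
  Returns the number of UNDOs visited, or -1 if the chain is invalid.
*/
static long walk_undo_chain(const uint64_t *prev_of, size_t n, uint64_t start)
{
  uint64_t undo_lsn= start;
  uint64_t last_undo= start + 1;     /* sentinel just above the first LSN */
  long executed= 0;
  assert(prev_of != NULL);
  while (undo_lsn)
  {
    if (undo_lsn >= last_undo || undo_lsn >= n)
      return -1;                     /* not strictly decreasing, or out of range */
    last_undo= undo_lsn;
    undo_lsn= prev_of[undo_lsn];     /* "execute" the UNDO, follow the chain */
    executed++;
  }
  return executed;
}
```

A well-formed chain 3 -> 2 -> 1 -> 0 visits three UNDOs, while a chain in which LSN 1 points back to LSN 2 is rejected on the third step.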
last_undo= trn->undo_lsn + 1;
/* Execute all undo entries */
while (trn->undo_lsn)
{
TRANSLOG_HEADER_BUFFER rec;
LOG_DESC *log_desc;
DBUG_ASSERT(trn->undo_lsn < last_undo);
last_undo= trn->undo_lsn;
if (translog_read_record_header(trn->undo_lsn, &rec) ==
RECHEADER_READ_ERROR)
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that cause reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd: 
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non transational tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page contect doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readablity Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ER_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages. 
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
DBUG_RETURN(1);
log_desc= &log_record_type_descriptor[rec.type];
WL#3072 Maria Recovery misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug; see the comment for ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output at the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and reports whether they differ. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in record count (getting a correct record count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed).
2007-09-06 16:04:36 +02:00
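The undo_lsn bookkeeping described in the commit message above can be pictured with a small toy model. This is only a sketch under stated assumptions, not the code in ma_recovery.c: the record layout, the names (sketch_rec, redo_phase_undo_lsn, count_undos_to_execute, is_undo_work) and the value chosen for SKETCH_LSN_ERROR are all hypothetical. It shows three points from the message: an UNDO sets the transaction's undo_lsn to its own LSN, a CLR_END sets it to the stored LSN of the _previous_ UNDO, and 0 is a legal "no earlier UNDO" value, so a separate impossible LSN is needed to mean "this is not UNDO work".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;

/* Hypothetical sentinel: any value that can never be a real LSN. */
#define SKETCH_LSN_ERROR UINT64_MAX

enum sketch_type { SKETCH_UNDO, SKETCH_CLR_END };

struct sketch_rec
{
  sketch_lsn lsn;            /* LSN of this record */
  enum sketch_type type;
  sketch_lsn prev_undo_lsn;  /* LSN of the previous UNDO, 0 if there is none */
};

/* REDO phase: scan the log forward and track the transaction's undo_lsn. */
static sketch_lsn redo_phase_undo_lsn(const struct sketch_rec *log, size_t n)
{
  sketch_lsn undo_lsn= 0;
  for (size_t i= 0; i < n; i++)
  {
    if (log[i].type == SKETCH_UNDO)
      undo_lsn= log[i].lsn;            /* newest UNDO heads the chain */
    else
      undo_lsn= log[i].prev_undo_lsn;  /* CLR_END: one UNDO is already undone */
  }
  return undo_lsn;
}

/* 0 is a valid "previous UNDO" value, so only the sentinel means "not UNDO work". */
static int is_undo_work(sketch_lsn undo_lsn)
{
  return undo_lsn != SKETCH_LSN_ERROR;
}

/*
  UNDO phase: follow the chain backwards and count the UNDOs still to execute.
  Returns -1 if the chain is broken or longer than the log (loop guard).
*/
static int count_undos_to_execute(const struct sketch_rec *log, size_t n,
                                  sketch_lsn undo_lsn)
{
  int executed= 0;
  size_t steps= 0;
  while (undo_lsn != 0)
  {
    const struct sketch_rec *rec= NULL;
    for (size_t i= 0; i < n; i++)
      if (log[i].lsn == undo_lsn && log[i].type == SKETCH_UNDO)
        rec= &log[i];
    if (rec == NULL || ++steps > n)
      return -1;
    executed++;
    undo_lsn= rec->prev_undo_lsn;
  }
  return executed;
}
```

With three UNDOs at LSNs 10, 20 and 30 followed by a CLR_END that stores 20, the REDO phase leaves undo_lsn at 20, and the UNDO phase then has two records left to execute.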
display_record_position(log_desc, &rec, 0);
if (log_desc->record_execute_in_undo_phase(&rec, trn))
{
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Got error %d when executing undo %s", my_errno,
UNDO of rows now puts back all part of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn of DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for not transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed of REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that was defined as BOOLEAN in dbug.c code Added DBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn of DBUG with --debug-on=0) Indentation fixes 
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for not transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all part of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directroy entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minmum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_isnert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in it's original position Ensure allocation of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed som bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there is no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused paged cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easly mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings. 
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we where in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always includes row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Chaned version number of Maria files to not accidently use old ones (becasue of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before blobs was always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that tests recovery of update with blobs Removed comparision of .MAD file as it's not guranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_mara_row->min_length' for storing min length needed on head page Added 'st_mara_handler->blob_buff & blob_buff_size' for 
storing blobs Moved all tuning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before blobs were always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the randomness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
log_desc->name);
translog_free_record_header(&rec);
DBUG_RETURN(1);
}
translog_free_record_header(&rec);
}
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistent * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDOs for INSERT/UPDATE/DELETE are always about real transactions, so we can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
if (trnman_rollback_trn(trn))
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that cause reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd: 
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non transational tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page contect doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readablity Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ER_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages. 
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
DBUG_RETURN(1);
/* We could want to spawn a few threads (4?) instead of 1 */
/* In the future, we want to have this phase *online* */
}
}
UNDO of rows now puts back all part of the row on their original pages and positions Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Added --debug-on option to mysqld (to be able to turn of DBUG with --debug-on=0) Fixed some bugs with 'non_flushable' marking of bitmap pages Don't use 'non_flushable' marking of bitmap pages for not transactional tables SHOW CREATE TABLE now shows if table was created with page checksums Fixed a lot of bugs with BLOB handling in case of update/REDO and UNDO More tests (especially for blobs) and DBUG_ASSERTS() More readable output from maria_read_log and maria_chk Fixed wrong shift that caused Maria to crash on files > 4G Mark tables as crashed of REDO fails dbug/dbug.c: Changed to use my_bool (allowed me to remove some windows specific code) Added variable _dbug_on_ to speed up execution when DBUG is not going to be used Removed initialization of variables if not needed include/my_dbug.h: Use my_bool for some functions that was defined as BOOLEAN in dbug.c code Added DBUGGER_ON/DEBUGGER_OFF to speed up execution when DBUG is not used include/my_global.h: Define my_bool early Increase MY_HOW_OFTEN_TO_WRITE as computers are now faster than 10 years ago mysql-test/mysql-test-run.pl: Added debug-on=0 to speed up tests mysql-test/r/maria-recovery.result: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/r/maria.result: Added testing of page checksums mysql-test/t/crash_commit_before-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery-bitmap-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery-master.opt: Added --debug-on as test require DBUG to work mysql-test/t/maria-recovery.test: Added new test by Guilhem to test if UNDO_ROW_DELETE preserves rowid mysql-test/t/maria.test: Added testing of page checksums sql/mysqld.cc: Added --debug-on option (to be able to turn of DBUG with --debug-on=0) Indentation fixes 
Removed end spaces sql/sql_show.cc: Allow update_create_info() to inform MySQL if PACK_KEYS, NO_PACK_KEYS, CHECKSUM, PAGE_CHECKSUM or DELAY_KEY_WRITE is used storage/maria/Makefile.am: Added ma_test_big.sh storage/maria/ha_maria.cc: Store in create_info if page checksums are used (For SHOW CREATE TABLE) storage/maria/ma_bitmap.c: Added _ma_bitmap_wait_or_flush() to cause reader of bitmap pages to wait with reading until bitmap is flushed. Use TAIL_PAGE_COUNT_MARKER for tail pages Set 'sub_blocks' for and only for the head page or for the first extent of a blob. This is needed for store_extent_info() to be able to set START_EXTENT_BIT's Don't allocate more than 0x3ffff pages in one extent (We need bit 0x4000 as a START_EXTENT_BIT) Increase the calculated 'head_length' with the number of bytes used for extents. Update row->space_on_head_page also in _ma_bitmap_find_new_place() Make _ma_bitmap_get_page_bits() global. (Needed for UNDO handling) Changed _ma_bitmap_flushable() to take MARIA_HA instead of MARIA_SHARE. This was needed to be able to mark the handler if we had a 'non_flushable' call pending or not. Don't use 'non_flushable' marking of bitmap pages for not transactional tables. Added BLOCKUSED_USE_ORG_BITMAP handling also for tail pages. Added more DBUG_ASSERT() to find possible errors in other code Some code simplications by adding new local variables storage/maria/ma_blockrec.c: UNDO of rows now puts back all part of the row on their original pages and positions. Changed UNDO of DELETE and UNDO of UPDATE to contain information about the original length of data on head block and also extent information This changes a lot of logic as now an insert of a row on a page may happen to any position (and not just to the first or next free) Use PAGE_COUNT to mark if an extent is the start of of a blob. (Needed for extent_to_bitmap_blocks()) Added check_directory() for checking that directroy entries are correct. 
Added checking of row checksums when reading rows (with EXTRA_DEBUG) Added make_space_for_directory() and extend_directory() for doing expansion of directory Added get_rowpos_in_head_or_tail_page() to be able to store head/tail on original position in UNDO Added extent_to_bitmap_blocks() to be able to generate original bitmap blocks from UNDO entry Added _ma_update_at_original_place() for UNDO of DELETES Added row->min_length to hold minmum required space needed on head page Changed find_free_position() to use make_space_for_directory() Changed make_empty_page() to allow optional creation of directory entry Changed delete_head_or_tail() and _ma_apply_undo_row_isnert() to not copy pagecache block (speed optimization) Changed _ma_apply_redo_insert_row_head_or_tail() to be able to insert new row at any position on 'new' page Changed _ma_apply_undo_row_delete() and _ma_apply_undo_row_update() to put row in it's original position Ensure allocation of tail blocks are of at least MIN_TAIL_SIZE. Ensure we store pages in pinned pages even if read failed. (If not we will have pages pinned forever in page cache) Write original extent information in UNDO entry, not compacted ones (we need position to tails!) 
When setting BLOCKUSED_USED, don't clear other bits (we have to preserve BLOCKUSED_USE_ORG_BITMAP) Fixed som bugs in directory handling Fixed bug where we wrote wrong lsn to blob pages Added separate blob_buffer for fixing bug when updating row that had char/varchar that spanned several pages and also had blobs Ensure we call _ma_bitmap_flushable() also in case of errors When doing an update, first delete old entries, then search in bitmap for where to put new information Info->s -> share Rowid -> rowid More DBUG_ASSERT() storage/maria/ma_blockrec.h: Added START_EXTENT_BIT and TAIL_PAGE_COUNT_MARKER Added _ma_bitmap_wait_or_flush() and _ma_bitmap_get_page_bits() storage/maria/ma_check.c: Don't write extra empty line if there is no deleted blocks Ignore START_EXTENT_BIT's in page count Call _ma_fast_unlock_key_del() to free key_del link storage/maria/ma_close.c: Ensure that used_key_del is 0. (If not, someone forgot to call _ma_unlock_key_del()) storage/maria/ma_create.c: Changed constant to macro storage/maria/ma_delete.c: For deleted keys, log also position to row storage/maria/ma_extra.c: Release blob buffer at maria_reset() if bigger than MARIA_SMALL_BLOB_BUFFER storage/maria/ma_key_recover.c: Added bzero() of LSN that confused paged cache in case of uninitialized block Mark file crashed if applying of index changes fails Added calls to _ma_fast_unlock_key_del() for protection of shared key_del link. storage/maria/ma_locking.c: Added usage of MARIA_FILE_OPEN_COUNT_OFFSET Added _ma_mark_file_crashed() storage/maria/ma_loghandler.c: Fixed bug where we logged uninitialized memory storage/maria/ma_open.c: Moved state->changed to be at start of state info on disk to allow one to easly mark files as crashed storage/maria/ma_page.c: Disable 'dummy' checksumming of pages as this gave false warnings. 
(Need to investigate if this is ever needed) storage/maria/ma_pagecache.c: Fixed wrong shift that caused Maria to crash on files > 4G storage/maria/ma_recovery.c: In case of errors, start writing on new line if we where in %## %## printing mode (Made errors more readable) Changed global variable name from warnings -> recovery_warnings Use MARIA_FILE_CREATE_RENAME_LSN_OFFSET instead of constant Removed special handling of row position for deleted keys. Keys now always includes row positions _ma_apply_undo_row_delete() now gets page and row position Added check that we don't loop forever when handling undo's (in case of bug in undo chain) Print name of failed REDO/UNDO storage/maria/ma_recovery.h: Removed old comment storage/maria/ma_static.c: Chaned version number of Maria files to not accidently use old ones (becasue of change of ordering of status variables) storage/maria/ma_test2.c: Added option -u to specify number of rows to update Changed old option -u to be -A, as for ma_test1 Fixed bug in update of rows with blobs (before blobs was always reset to empty on update) First created blob is now of max blob length to ensure we have at least one big blob in the table storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tests to use bigger blobs (not just 1K) Added new tests that tests recovery of update with blobs Removed comparision of .MAD file as it's not guranteed that recovery from scratch gives identical data file as original update (compact_page() may be called at different times during normal execution and during REDO) storage/maria/ma_update.c: Simplify code (changed * to if) storage/maria/maria_chk.c: Make output more readable storage/maria/maria_def.h: Changed 'changed' to int to prepare for more bits Added 2 more bytes to status information Added 'st_mara_row->min_length' for storing min length needed on head page Added 'st_mara_handler->blob_buff & blob_buff_size' for 
storing blobs Moved all tunning parameters into one block Added MARIA_SMALL_BLOB_BUFFER Added _ma_mark_file_crashed() storage/myisam/mi_test2.c: Fixed bug in update of rows with blobs (before blobs was always reset to empty on update) storage/maria/ma_test_big.sh: Testing of insert, update, delete, recovery and undo of rows with blobs Thanks to the random-ness of ma_test2 this is likely to find most bugs in the row handling
2007-12-30 21:40:03 +01:00
procent_printed= 0;
DBUG_RETURN(0);
}
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
  - write log record for repair operation only after it's fully done, index sort included (maria_repair*() used to write the log record).
  - Adding back file->trn=NULL which was removed by mistake earlier.
storage/maria/ha_maria.h:
  new member (see ha_maria.cc)
storage/maria/ma_bitmap.c:
  Functions to create missing bitmaps:
  - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon.
  - one function which the one above calls, and creates bitmaps in page cache
  - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above.
storage/maria/ma_blockrec.c:
  - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' have to be reset under log's mutex.
  - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list
  - execution of UNDO_BULK_INSERT_WITH_REPAIR
storage/maria/ma_blockrec.h:
  new functions
storage/maria/ma_check.c:
  - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added
  - maria_repair() is allowed to re-enable logging only if it is the one which disabled it.
  - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record)
  - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_maria::repair())
  - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end.
  - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair().
storage/maria/ma_create.c:
  Update skip_redo_lsn, create_rename_lsn optionally.
storage/maria/ma_delete_all.c:
  Need to sync files at end of maria_delete_all_rows(), if transactional.
storage/maria/ma_extra.c:
  During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE.
storage/maria/ma_key_recover.c:
  - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices.
  - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure
storage/maria/ma_key_recover.h:
  new prototype
storage/maria/ma_locking.c:
  DBUG_VOID_RETURN missing
storage/maria/ma_loghandler.c:
  UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type.
storage/maria/ma_loghandler.h:
  new UNDO and REDO
storage/maria/ma_open.c:
  Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes.
storage/maria/ma_recovery.c:
  - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings.
  - Skip REDO|UNDO in REDO phase if <skip_redo_lsn.
  - If UNDO phase fails, delete transactions to not make trnman assert.
  - Update skip_redo_lsn when playing REDO_CREATE_TABLE
  - Don't record UNDOs for old transactions which we don't know (long_trid==0)
  - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c)
  - Execution of UNDO_BULK_INSERT_WITH_REPAIR
  - Don't try to find a page number in REDO_DELETE_ALL
  - Pieces moved to ma_recovery_util.c
storage/maria/ma_rename.c:
  name change
storage/maria/ma_static.c:
  I modified layout of the index' header (inserted skip_redo_lsn in its middle)
storage/maria/ma_test2.c:
  allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT
storage/maria/ma_test_recovery.expected:
  6 as testflag instead of 4
storage/maria/ma_test_recovery:
  Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug.
storage/maria/maria_chk.c:
  skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk().
storage/maria/maria_def.h:
  New member skip_redo_lsn in the state, and comments
storage/maria/maria_pack.c:
  skip_redo_lsn should be updated too, for consistency
storage/maria/ma_recovery_util.c:
  _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c).
storage/maria/ma_recovery_util.h:
  new file
2008-01-17 23:59:32 +01:00
/**
   In case of error in recovery, deletes all transactions from the transaction
   manager so that this module does not assert.

   @note no checkpoint should be taken as those transactions matter for the
   next recovery (they still haven't been properly dealt with).
*/
static void delete_all_transactions()
{
  for( ; ; )
  {
    TRN *trn= trnman_get_any_trn();
    if (trn == NULL)
      break;
    trn->undo_lsn= trn->first_undo_lsn= LSN_IMPOSSIBLE;
    trnman_rollback_trn(trn); /* ignore error */
  }
}
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion.
- WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state")). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share.
- don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday).
storage/maria/ha_maria.cc:
  protect state write with intern_lock (against Checkpoint)
storage/maria/ma_blockrec.c:
  * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that.
  * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record().
storage/maria/ma_check.c:
  new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock.
storage/maria/ma_close.c:
  Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state.
storage/maria/ma_create.c:
  When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open).
storage/maria/ma_delete.c:
  if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE.
storage/maria/ma_delete_all.c:
  comments
storage/maria/ma_extra.c:
  Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update its is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY
storage/maria/ma_locking.c:
  no real code change here.
storage/maria/ma_loghandler.c:
  Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows.
storage/maria/ma_loghandler_lsn.h:
  merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different)
storage/maria/ma_open.c:
  When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file.
storage/maria/ma_recovery.c:
  Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has an LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record.
storage/maria/ma_rename.c:
  update for new behaviour of _ma_update_create_rename_lsn_on_disk().
storage/maria/ma_test1.c:
  update to new prototype
storage/maria/ma_test2.c:
  update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??)
storage/maria/ma_test_recovery.expected:
  new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct.
storage/maria/ma_test_recovery:
  "rm" fails if file does not exist. Redirect stderr of script.
storage/maria/ma_write.c:
  if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments.
storage/maria/maria_chk.c:
  update is_of_lsn too
storage/maria/maria_def.h:
  - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header.
  - Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it".
  - new functions
storage/maria/maria_pack.c:
  update for new name
2007-09-07 15:02:30 +02:00
/**
WL#3071 Maria checkpoint
Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed).
WL#3072 Maria recovery
* replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys.
* replaying of REDO_RENAME_TABLE
Now, off to test Recovery in ha_maria :)
BitKeeper/deleted/.del-ma_least_recently_dirtied.c:
  Delete: storage/maria/ma_least_recently_dirtied.c
BitKeeper/deleted/.del-ma_least_recently_dirtied.h:
  Delete: storage/maria/ma_least_recently_dirtied.h
storage/maria/Makefile.am:
  compile Checkpoint module
storage/maria/ha_maria.cc:
  When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly.
storage/maria/ma_blockrec.c:
  * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor)
  * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase.
storage/maria/ma_check.c:
  All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair().
storage/maria/ma_checkpoint.c:
  Checkpoint module.
storage/maria/ma_checkpoint.h:
  optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages.
storage/maria/ma_create.c:
  * no need to init some vars as the initial bzero(share) takes care of this.
  * update to new function's name
  * even if we fail in my_sync() we have to my_close()
storage/maria/ma_extra.c:
  Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen().
storage/maria/ma_init.c:
  destroy checkpoint module when Maria shuts down.
storage/maria/ma_loghandler.c:
  * UNDO_ROW_PURGE gone (see ma_blockrec.c)
  * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed in the checkpoint record (Recovery wants to know the validity domain of an id->name mapping)
  * translog_get_horizon_no_lock() needed for Checkpoint
  * comment about failing assertion (Sanja knows)
  * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header".
  * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook.
  * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock.
  * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN.
storage/maria/ma_loghandler.h:
  prototype updates
storage/maria/ma_open.c:
  no need to initialize "res"
storage/maria/ma_pagecache.c:
  When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int.
storage/maria/ma_pagecache.h:
  new prototype
storage/maria/ma_recovery.c:
  * added replaying of REDO_RENAME_TABLE
  * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END
  * Recovery from the last checkpoint record now possible
  * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id).
  * in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record).
  * parse_checkpoint_record() has to return an LSN, that's what caller expects
storage/maria/ma_rename.c:
  new function's name; log end zeroes of tables' names (ease recovery)
storage/maria/ma_test2.c:
  * equivalent of ma_test1's --test-undo added (named -u here).
  * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()).
storage/maria/ma_test_recovery.expected:
  Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys" (etc) index state).
storage/maria/ma_test_recovery:
  Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT.
storage/maria/ma_write.c:
  comment
storage/maria/maria_chk.c:
  comment
storage/maria/maria_def.h:
  * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint.
  * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs).
storage/maria/trnman.c:
  * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
@brief re-enables transactionality, updates is_of_horizon
@param info table
@param horizon address to which is_of_horizon is set
*/
static void prepare_table_for_close(MARIA_HA *info, TRANSLOG_ADDRESS horizon)
{
  MARIA_SHARE *share= info->s;
  /*
    In a fully-forward REDO phase (no checkpoint record),
    state is now at least as new as the LSN of the current record. It may be
    newer, in case we are seeing a LOGREC_FILE_ID which tells us to close a
    table, but that table was later modified further in the log.
But if we parsed a checkpoint record, it may be this way in the log:
FILE_ID(6->t2)... FILE_ID(6->t1)... CHECKPOINT(6->t1)
Checkpoint parsing opened t1 with id 6; first FILE_ID above is going to
make t1 close; the first condition below is however false (when checkpoint
was taken it increased is_of_horizon) and so it works. For safety we
add the second condition.
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update it's is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
*/
if (cmp_translog_addr(share->state.is_of_horizon, horizon) < 0 &&
cmp_translog_addr(share->lsn_of_file_id, horizon) < 0)
{
share->state.is_of_horizon= horizon;
_ma_state_info_write_sub(share->kfile.file, &share->state,
MA_STATE_INFO_WRITE_DONT_MOVE_OFFSET);
}
WL#3138: Maria - fast "SELECT COUNT(*) FROM t;" and "CHECKSUM TABLE t" Added argument to maria_end_bulk_insert() to know if the table will be deleted after the operation Fixed wrong call to strmake Don't call bulk insert in case of inserting only one row (speed optimization as starting/stopping bulk insert Allow storing year 2155 in year field When running with purify/valgrind avoid copying structures over themself Added hook 'trnnam_end_trans_hook' that is called when transaction ends Added trn->used_tables that is used to an entry for all tables used by transaction Fixed that ndb doesn't crash on duplicate key error when start_bulk_insert/end_bulk_insert are not called include/maria.h: Added argument to maria_end_bulk_insert() to know if the table will be deleted after the operation include/my_tree.h: Added macro 'reset_free_element()' to be able to ignore calls to the external free function. Is used to optimize end-bulk-insert in case of failures, in which case we don't want write the remaining keys in the tree mysql-test/install_test_db.sh: Upgrade to new mysql_install_db options mysql-test/r/maria-mvcc.result: New tests mysql-test/r/maria.result: New tests mysql-test/suite/ndb/r/ndb_auto_increment.result: Fixed error message now when bulk insert is not always called mysql-test/suite/ndb/t/ndb_auto_increment.test: Fixed error message now when bulk insert is not always called mysql-test/t/maria-mvcc.test: Added testing of versioning of count(*) mysql-test/t/maria-page-checksum.test: Added comment mysql-test/t/maria.test: More tests mysys/hash.c: Code style change sql/field.cc: Allow storing year 2155 in year field sql/ha_ndbcluster.cc: Added new argument to end_bulk_insert() to signal if the bulk insert should ignored sql/ha_ndbcluster.h: Added new argument to end_bulk_insert() to signal if the bulk insert should ignored sql/ha_partition.cc: Added new argument to end_bulk_insert() to signal if the bulk insert should ignored sql/ha_partition.h: Added new argument 
to end_bulk_insert() to signal if the bulk insert should ignored sql/handler.cc: Don't call get_dup_key() if there is no table object. This can happen if the handler generates a duplicate key error on commit sql/handler.h: Added new argument to end_bulk_insert() to signal if the bulk insert should ignored (ie, the table will be deleted) sql/item.cc: Style fix Removed compiler warning sql/log_event.cc: Added new argument to ha_end_bulk_insert() sql/log_event_old.cc: Added new argument to ha_end_bulk_insert() sql/mysqld.cc: Removed compiler warning sql/protocol.cc: Added DBUG sql/sql_class.cc: Added DBUG Fixed wrong call to strmake sql/sql_insert.cc: Don't call bulk insert in case of inserting only one row (speed optimization as starting/stopping bulk insert involves a lot of if's) Added new argument to ha_end_bulk_insert() sql/sql_load.cc: Added new argument to ha_end_bulk_insert() sql/sql_parse.cc: Style fixes Avoid goto in common senario sql/sql_select.cc: When running with purify/valgrind avoid copying structures over themself. This is not a real bug in itself, but it's a waste of cycles and causes valgrind warnings sql/sql_select.h: Avoid copying structures over themself. This is not a real bug in itself, but it's a waste of cycles and causes valgrind warnings sql/sql_table.cc: Call HA_EXTRA_PREPARE_FOR_DROP if table created by ALTER TABLE is going to be dropped Added new argument to ha_end_bulk_insert() storage/archive/ha_archive.cc: Added new argument to end_bulk_insert() storage/archive/ha_archive.h: Added new argument to end_bulk_insert() storage/federated/ha_federated.cc: Added new argument to end_bulk_insert() storage/federated/ha_federated.h: Added new argument to end_bulk_insert() storage/maria/Makefile.am: Added ma_state.c and ma_state.h storage/maria/ha_maria.cc: Versioning of count(*) and checksum - share->state.state is now assumed to be correct, not handler->state - Call _ma_setup_live_state() in external lock to get count(*)/checksum versioning. 
In case of not versioned and not concurrent insertable table, file->s->state.state contains the correct state information Other things: - file->s -> share - Added DBUG_ASSERT() for unlikely case - Optimized end_bulk_insert() to not write anything if table is going to be deleted (as in failed alter table) - Indentation changes in external_lock becasue of removed 'goto' caused a big conflict even if very little was changed storage/maria/ha_maria.h: New argument to end_bulk_insert() storage/maria/ma_blockrec.c: Update for versioning of count(*) and checksum Keep share->state.state.data_file_length up to date (not info->state->data_file_length) Moved _ma_block_xxxx_status() and maria_versioning() functions to ma_state.c storage/maria/ma_check.c: Update and use share->state.state instead of info->state info->s to share Update info->state at end of repair Call _ma_reset_state() to update share->state_history at end of repair storage/maria/ma_checkpoint.c: Call _ma_remove_not_visible_states() on checkpoint to clean up not visible state history from tables storage/maria/ma_close.c: Remember state history for running transaction even if table is closed storage/maria/ma_commit.c: Ensure we always call trnman_commit_trn() even if other calls fails. If we don't do that, the translog and state structures will not be freed storage/maria/ma_delete.c: Versioning of count(*) and checksum: - Always update info->state->checksum and info->state->records storage/maria/ma_delete_all.c: Versioning of count(*) and checksum: - Ensure that share->state.state is updated, as here is where we store the primary information storage/maria/ma_dynrec.c: Use lock_key_trees instead of concurrent_insert to check if trees should be locked. This allows us to lock trees both for concurrent_insert and for index versioning. 
storage/maria/ma_extra.c: Versioning of count(*) and checksum: - Use share->state.state instead of info->state - share->concurrent_insert -> share->non_transactional_concurrent_insert - Don't update share->state.state from info->state if transactional table Optimization: - Don't flush io_cache or bitmap if we are using FLUSH_IGNORE_CHANGED storage/maria/ma_info.c: Get most state information from current state storage/maria/ma_init.c: Add hash table and free function to store states for closed tables Install hook for transaction commit/rollback to update history state storage/maria/ma_key_recover.c: Versioning of count(*) and checksum: - Use share->state.state instead of info->state storage/maria/ma_locking.c: Versioning of count(*) and checksum: - Call virtual functions (if exists) to restore/update status - Move _ma_xxx_status() functions to ma_state.c info->s -> share storage/maria/ma_open.c: Versioning of count(*) and checksum: - For not transactional tables, set info->state to point to new allocated state structure. - Initialize new info->state_start variable that points to state at start of transaction - Copy old history states from hash table (maria_stored_states) first time the table is opened - Split flag share->concurrent_insert to non_transactional_concurrent_insert & lock_key_tree - For now, only enable versioning of tables without keys (to be fixed in soon!) 
- Added new virtual function to restore status in maria_lock_database) More DBUG storage/maria/ma_page.c: Versioning of count(*) and checksum: - Use share->state.state instead of info->state - Modify share->state.state.key_file_length under share->intern_lock storage/maria/ma_range.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees info->s -> share storage/maria/ma_recovery.c: Versioning of count(*) and checksum: - Use share->state.state instead of info->state - Update state information on close and when reenabling logging storage/maria/ma_rkey.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees storage/maria/ma_rnext.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees storage/maria/ma_rnext_same.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees - Only skip rows based on file length if non_transactional_concurrent_insert is set storage/maria/ma_rprev.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees storage/maria/ma_rsame.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees storage/maria/ma_sort.c: Use share->state.state instead of info->state Fixed indentation storage/maria/ma_static.c: Added maria_stored_state storage/maria/ma_update.c: Versioning of count(*) and checksum: - Always update info->state->checksum and info->state->records - Remove optimization for index file update as it doesn't work for transactional tables storage/maria/ma_write.c: Versioning of count(*) and checksum: - Always update info->state->checksum and info->state->records storage/maria/maria_def.h: Move MARIA_STATUS_INFO to ma_state.h Changes to MARIA_SHARE: - Added state_history to store count(*)/checksum states - Added in_trans as counter if table is used by running transactions - Split concurrent_insert into lock_key_trees and on_transactional_concurrent_insert. 
- Added virtual function lock_restore_status Changes to MARIA_HA: - save_state -> state_save - Added state_start to store state at start of transaction storage/maria/maria_pack.c: Versioning of count(*) and checksum: - Use share->state.state instead of info->state Indentation fixes storage/maria/trnman.c: Added hook 'trnnam_end_trans_hook' that is called when transaction ends Added trn->used_tables that is used to keep an entry for each table used by the transaction More DBUG Changed return type of trnman_end_trn() to my_bool Added trnman_get_min_trid() to get minimum trid in use. Added trnman_exists_active_transactions() to check if there exists a running transaction started between two commit ids storage/maria/trnman.h: Added 'used_tables' Moved all pointers into same groups to get better memory alignment storage/maria/trnman_public.h: Added prototypes for new functions and variables Changed return type of trnman_end_trn() to my_bool storage/myisam/ha_myisam.cc: Added argument to end_bulk_insert() if operation should be aborted storage/myisam/ha_myisam.h: Added argument to end_bulk_insert() if operation should be aborted storage/maria/ma_state.c: Functions to handle state of count(*) and checksum storage/maria/ma_state.h: Structures and declarations to handle state of count(*) and checksum
2008-05-29 17:33:33 +02:00
/*
Ensure that info->state is up to date, as
_ma_reenable_logging_for_table() depends on this
*/
*info->state= info->s->state.state;
/*
This leaves PAGECACHE_PLAIN_PAGE pages in the cache, while the table is
going to switch back to transactional. So the table will be a mix of
plain and LSN pages, which is ok as long as we don't take any checkpoints
until all tables get closed at the end of the UNDO phase.
*/
_ma_reenable_logging_for_table(info, FALSE);
info->trn= NULL; /* safety */
}
static MARIA_HA *get_MARIA_HA_from_REDO_record(const
TRANSLOG_HEADER_BUFFER *rec)
{
uint16 sid;
pgcache_page_no_t page;
MARIA_HA *info;
MARIA_SHARE *share;
char llbuf[22];
my_bool index_page_redo_entry= FALSE, page_redo_entry= FALSE;
LINT_INIT(page);
print_redo_phase_progress(rec->lsn);
sid= fileid_korr(rec->header);
switch (rec->type) {
/* not all REDO records have a page: */
case LOGREC_REDO_INDEX_NEW_PAGE:
case LOGREC_REDO_INDEX:
case LOGREC_REDO_INDEX_FREE_PAGE:
index_page_redo_entry= 1;
/* Fall through */
case LOGREC_REDO_INSERT_ROW_HEAD:
case LOGREC_REDO_INSERT_ROW_TAIL:
case LOGREC_REDO_PURGE_ROW_HEAD:
case LOGREC_REDO_PURGE_ROW_TAIL:
case LOGREC_REDO_NEW_ROW_HEAD:
case LOGREC_REDO_NEW_ROW_TAIL:
case LOGREC_REDO_FREE_HEAD_OR_TAIL:
page_redo_entry= TRUE;
page= page_korr(rec->header + FILEID_STORE_SIZE);
llstr(page, llbuf);
break;
/*
For REDO_FREE_BLOCKS, no need to look at dirty pages list: it does not
read data pages, only reads/modifies bitmap page(s) which is cheap.
*/
default:
break;
}
tprint(tracef, " For table of short id %u", sid);
info= all_tables[sid].info;
#ifndef DBUG_OFF
DBUG_ASSERT(current_group_table == NULL || current_group_table == info);
current_group_table= info;
#endif
if (info == NULL)
{
tprint(tracef, ", table skipped, so skipping record\n");
return NULL;
}
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
share= info->s;
Changed all file names in maria to LEX_STRING and removed some calls to strlen(). Ensure that pagecache gives correct error number even if error for block happened. mysys/my_pread.c: Indentation fix storage/maria/ha_maria.cc: filenames changed to be of type LEX_STRING storage/maria/ma_check.c: filenames changed to be of type LEX_STRING storage/maria/ma_checkpoint.c: filenames changed to be of type LEX_STRING storage/maria/ma_create.c: filenames changed to be of type LEX_STRING storage/maria/ma_dbug.c: filenames changed to be of type LEX_STRING storage/maria/ma_delete.c: filenames changed to be of type LEX_STRING storage/maria/ma_info.c: filenames changed to be of type LEX_STRING storage/maria/ma_keycache.c: filenames changed to be of type LEX_STRING storage/maria/ma_locking.c: filenames changed to be of type LEX_STRING storage/maria/ma_loghandler.c: filenames changed to be of type LEX_STRING storage/maria/ma_open.c: filenames changed to be of type LEX_STRING storage/maria/ma_pagecache.c: Store error number for last failed operation in the page block. This should fix some asserts() when errno was not properly set after failure to read block in another thread storage/maria/ma_recovery.c: filenames changed to be of type LEX_STRING storage/maria/ma_update.c: filenames changed to be of type LEX_STRING storage/maria/ma_write.c: filenames changed to be of type LEX_STRING storage/maria/maria_def.h: filenames changed to be of type LEX_STRING storage/maria/maria_ftdump.c: filenames changed to be of type LEX_STRING storage/maria/maria_pack.c: filenames changed to be of type LEX_STRING
2008-08-25 13:49:47 +02:00
tprint(tracef, ", '%s'", share->open_file_name.str);
DBUG_ASSERT(in_redo_phase);
if (cmp_translog_addr(rec->lsn, share->lsn_of_file_id) <= 0)
WL#3071 Maria checkpoint Finally this is the real checkpoint code. However, it exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
{
/*
This can happen only when processing a record that precedes the
checkpoint record.
The id->name mapping is newer than the REDO record, so the table that
the REDO applied to has certainly been flushed and forced to disk (id
re-assignment implies this). The REDO can be ignored, and must be,
because we no longer know which table it referred to.
*/
DBUG_ASSERT(cmp_translog_addr(rec->lsn, checkpoint_start) < 0);
tprint(tracef, ", table's LOGREC_FILE_ID has LSN (%lu,0x%lx) more recent"
" than record, skipping record",
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
LSN_IN_PARTS(share->lsn_of_file_id));
return NULL;
}
if (cmp_translog_addr(rec->lsn, share->state.skip_redo_lsn) <= 0)
{
/* probably a bulk insert repair */
tprint(tracef, ", has skip_redo_lsn (%lu,0x%lx) more recent than"
" record, skipping record\n",
LSN_IN_PARTS(share->state.skip_redo_lsn));
return NULL;
}
/* detect an open instance of a dropped table (internal bug) */
DBUG_ASSERT(share->last_version != 0);
if (page_redo_entry)
2007-07-26 11:56:21 +02:00
{
/*
Consult dirty pages list.
REDO_INSERT_ROW_BLOBS will consult list by itself, as it covers several
pages.
*/
tprint(tracef, " page %s", llbuf);
if (_ma_redo_not_needed_for_page(sid, rec->lsn, page,
index_page_redo_entry))
return NULL;
2007-07-26 11:56:21 +02:00
}
/*
So we are going to read the page, and if its LSN is older than the
record's we will modify the page
*/
tprint(tracef, ", applying record\n");
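
The comment above states the rule that makes applying a REDO repeatable: read the page, and modify it only if the page's LSN is older than the record's. A minimal, self-contained sketch of that check follows. The `demo_*` names and the `lsn_t` typedef are hypothetical and not part of ma_recovery.c, and the sketch stamps the page with the record's own LSN for simplicity, whereas the commit notes say the real code stamps it with the LSN of the group's UNDO.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t lsn_t;           /* hypothetical LSN type for this sketch */

typedef struct
{
  lsn_t lsn;                      /* LSN of the last change applied to the page */
  char payload[32];               /* stand-in for the page contents */
} demo_page;

/* Apply a REDO only if the page is older than the record.
   Returns 1 if the record was applied, 0 if it was skipped. */
static int demo_apply_redo(demo_page *page, lsn_t rec_lsn, const char *data)
{
  if (page->lsn >= rec_lsn)
    return 0;                     /* page already contains this change */
  strncpy(page->payload, data, sizeof(page->payload) - 1);
  page->payload[sizeof(page->payload) - 1] = '\0';
  page->lsn = rec_lsn;            /* simplification: stamp with the record's LSN */
  return 1;
}
```

Because the page's LSN advances when a record is applied, running the same log a second time skips every record that is already reflected on the page.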
- speed optimization: minimize writes to transactional Maria tables: don't write data pages, state, and open_count at the end of each statement. Data pages will be written by a background thread periodically. State will be written by Checkpoint periodically. open_count serves to detect when a table is potentially damaged due to an unclean mysqld stop, but thanks to recovery an unclean mysqld stop will be corrected and so open_count becomes useless. As state is written less often, it is often obsolete on disk, so we should avoid reading it from disk.
- having removed the data page writes above, it is necessary to put them back at the start of some statements like check, repair and delete_all. It was already necessary in fact (see ma_delete_all.c).
- disabling CACHE INDEX on Maria tables for now (fixes crash of test 'key_cache' when run with --default-storage-engine=maria).
- correcting some fishy code in ma_extra.c (we possibly could lose index pages when doing a DROP TABLE under Windows, in theory).

storage/maria/ha_maria.cc: disable CACHE INDEX in Maria for now (there is a single cache for now), it crashes and it's not a priority
storage/maria/ma_bitmap.c: debug message
storage/maria/ma_check.c: The statement before maria_repair() may not flush state, so it needs to be done by maria_repair() (indeed this function uses maria_open(HA_OPEN_COPY) so reads state from disk, so needs to find it up-to-date on disk). For safety (but normally this is not needed) we remove index blocks out of the cache before repairing. _ma_flush_blocks() becomes _ma_flush_table_files_after_repair(): it now additionally flushes the data file and state and syncs files. As a side effect, the assertion "no WRITE_CACHE_USED" from _ma_flush_table_files() fired so we move all end_io_cache() done at the end of repair to before the calls to _ma_flush_table_files_after_repair().
storage/maria/ma_close.c: when closing a transactional table, we fsync it. But we need to do this only after writing its state. We need to write the state at close time only for transactional tables (the other tables do that at last unlock). Putting back the O_RDONLY||crashed condition which I had removed earlier. Unmap the file before syncing it (does not matter now as Maria does not use mmap)
storage/maria/ma_delete_all.c: need to flush data pages before chsize-ing the data file. This was needed even when we flushed data pages at the end of each statement, because we did not do it under LOCK TABLES anyway: the change here thus fixes this bug: create table t(a int) engine=maria; lock tables t write; insert into t values(1); delete from t; unlock tables; check table t; "Size of datafile is: 16384 Should be: 8192" (an obsolete page went to disk after the chsize(), at unlock time).
storage/maria/ma_extra.c: When doing share->last_version=0, we make the MARIA_SHARE-in-memory invisible to future openers, so need to have an up-to-date state on disk for them. The same way, future openers will reopen the data and index file, so they will not find our cached blocks, so we need to flush them to disk. In HA_EXTRA_FORCE_REOPEN, this probably happens naturally as all tables normally get closed, we however add a safety flush. In HA_EXTRA_PREPARE_FOR_RENAME, we need to do the flushing. On Windows we additionally need to close files. In HA_EXTRA_PREPARE_FOR_DROP, we don't need to flush anything but remove dirty cached blocks from memory. On Windows we need to close files. Closing files forces us to sync them before (requirement for transactional tables). For mutex reasons (don't lock intern_lock twice), we move maria_lock_database() and _ma_decrement_open_count() first in the list of operations. Flush also data file in HA_EXTRA_FLUSH.
storage/maria/ma_locking.c: For transactional tables:
- don't write data pages / state at unlock time; as a consequence, "share->changed=0" cannot be done.
- don't write state in _ma_writeinfo()
- don't maintain open_count on disk (Recovery corrects the table in case of crash anyway, and we gain speed by not writing open_count to disk).
For non-transactional tables, flush the state at unlock only if the table was changed (optimization). Code which read the state from disk is relevant only with external locking, so we disable it (if we want to re-enable it, it should not be enabled for transactional tables, as state on disk may be obsolete: such tables do not flush state at unlock anymore). The comment "We have to flush the write cache" is now wrong because maria_lock_database(F_UNLCK) now happens before thr_unlock(), and we are not using external locking.
storage/maria/ma_open.c: _ma_state_info_read() is only used in ma_open.c, making it static
storage/maria/ma_recovery.c: set MARIA_SHARE::changed to TRUE when we are going to apply a REDO/UNDO, so that the state gets flushed at close.
storage/maria/ma_test_recovery.expected: Changes introduced by this patch:
- good: the "open" (table open, not properly closed) is gone, it was pointless for a recovered table
- bad: probably stemming from the different moments at which the index's state is written (_ma_writeinfo() used to write the state after every row write in ma_test* programs, and doesn't anymore as the table is transactional): some differences in indexes (not relevant as we don't yet have recovery for them); some differences in count of records (changed from a wrong value to another wrong value) (not relevant as we don't recover this count correctly yet anyway, though a patch will be pushed soon).
storage/maria/ma_test_recovery: for repeatable output, no names of varying directories.
storage/maria/maria_chk.c: function renamed
storage/maria/maria_def.h: Function became local to ma_open.c. Function renamed.
2007-09-06 16:53:26 +02:00
_ma_writeinfo(info, WRITEINFO_UPDATE_KEYFILE); /* to flush state on close */
return info;
WL#3072 - Maria recovery Unit test for recovery: runs ma_test1 and ma_test2 (both only with INSERTs and DELETEs; UPDATEs disabled as not handled by recovery) then moves the tables elswhere; recreates tables from the log, and compares and fails if there is a difference. Passes now. Most of maria_read_log.c moved to ma_recovery.c, as it will be re-used for recovery-from-ha_maria. Bugfixes of applying of REDO_INSERT, REDO_PURGE_ROW. Applying of REDO_PURGE_BLOCKS, REDO_DELETE_ALL, REDO_DROP_TABLE, UNDO_ROW_INSERT (in REDO phase only, i.e. just doing records++), UNDO_ROW_DELETE, UNDO_ROW_PURGE. Code cleanups. Monty: please look for "QQ". Sanja: please look for "Sanja". Future tasks: recovery of the bitmap (easy), recovery of the state (make it idempotent), more REDOs (Monty to work on REDO_UPDATE?), UNDO phase... Pushing this cset as it looks safe, contains test and bugfixes which will help Monty implement applying of REDO_UPDATE. sql/handler.cc: typo storage/maria/Makefile.am: Adding ma_test_recovery (which ma_test_all invokes, and which can also be run alone). Most of maria_read_log.c moved to ma_recovery.c storage/maria/ha_maria.cc: comments storage/maria/ma_bitmap.c: fixing comments. 2 -> sizeof(maria_bitmap_marker). Bitmap-related part of _ma_initialize_datafile() moves in bitmap module. Now putting the "bm" signature when creating the first bitmap page (it used to happen only at next open, but that caused an annoying difference when testing Recovery if the original run didn't open the table, and it looks more logical like this: it goes to disk only with its signature correct); see the "QQ" comment towards the _ma_initialize_data_file() call in ma_create.c for more). When reading a bitmap page, verify its signature (happens when normally using the table or when CHECKing it; not when REPAIRing it). 
storage/maria/ma_blockrec.c: * no need to sync the data file if table is not transactional * Comments, code cleanup (log-related data moved to log-related code block, int5store->page_store). * Store the table's short id into LOGREC_UNDO_ROW_PURGE, like we do for other records (though this record will soon be replaced with a CLR). * If "page" is 1 it means the page which extends from byte page*block_size+1 to (page+1)*block_size (byte number 1 being the first byte of the file). The last byte of the file is data_file_length (same convention). A new page needs to be created if the last byte of the page is beyond the last byte of the file, i.e. (page+1)*block_size+1 > data_file_length, so we correct the test (bug found when testing log applying for ma_test1 -M -T --skip-update). * update the page's LSN when removing a row from it during execution of a REDO_PURGE_ROW record (bug found when testing log applying for ma_test1 -M -T --skip-update). * applying of REDO_PURGE_BLOCKs (limited to a one-page range for now). storage/maria/ma_blockrec.h: new functions. maria_bitmap_marker does not need to be exported. storage/maria/ma_close.c: we can always flush the table's state when closing the last instance of the table. And it is needed for maria_read_log (as it does not use maria_lock_database()). storage/maria/ma_control_file.c: when in Recovery, some assertions should not be used. storage/maria/ma_control_file.h: double-inclusion safe storage/maria/ma_create.c: during recovery, don't log records. Comments. Moving the creation of the first bitmap page to ma_bitmap.c storage/maria/ma_delete_table.c: during recovery, don't log records. Log the end-zero of the dropped table's name, so that recovery can use the string in place without extending it to fit an end zero. storage/maria/ma_loghandler.c: * inwrite_rec_hook also needs access to the MARIA_SHARE, like prewrite_rec_hook. This will be needed to update share->records_diff (in the upcoming patch "recovery of the state"). 
* LOG_DESC::record_ends_group changed to an enum. * LOG_DESC for LOGREC_REDO_PURGE_BLOCKS and LOGREC_UNDO_ROW_PURGE corrected * Sanja please see the @todo LOG BUG * avoiding DBUG_RETURN(func()) as it gives confusing debug traces. storage/maria/ma_loghandler.h: - log write hooks called while the log's lock is held (inwrite_rec_hook) now need the MARIA_SHARE, like prewrite_rec_hook already had - instead of a bool saying if this record's type ends groups or not, we refine: it may not end a group, it may end a group, or it may be a group in itself. Imagine that we had a physical write failure to a table before we log the UNDO, we still end up in external_lock(F_UNLCK) and then we log a COMMIT: we don't want to consider this COMMIT as ending the group of REDOs (don't want to execute those REDOs during Recovery), that's why we say "COMMIT is a group in itself, it aborts any previous group". This also gives one more sanity check in maria_read_log. storage/maria/ma_recovery.c: New Recovery code, replacing the old pseudocode. Most of maria_read_log moved here. Call-able from ha_maria, but not enabled yet. Compared to the previous version of maria_read_log, some bugs have been fixed, debugging output can go to stdout or a disk file (for now it's useful for me, later it can be changed), execution of REDO_DROP_TABLE, REDO_DELETE_ALL, REDO_PURGE_BLOCKS has been added. Duplicate code has been factored into functions. We abort an unfinished group of records if we see a record which is a group in itself (like COMMIT). No need for maria_panic() after a bug (which caused tables to not be closed) was fixed; if there is yet another bug I prefer to see it. When opening a table for Recovery, set data_file_length and key_file_length to their real physical value (these are the easiest state members to restore :). Warn us if the last page was truncated (but Recovery handles it). 
MARIA_SHARE::state::state::records is now partly recovered (not idempotent, but works if recreating tables from scratch). When applying a REDO to a page, stamp it with the UNDO's LSN (current_group_end_lsn), not with the REDO's LSN; it makes the table more identical to the original table (easier to compare the two tables in the end). Big thing missing: some types of REDOs are not handled, and the UNDO phase does not exist (missing functions to execute UNDOs to actually rollback). So for now tests are only inserting/deleting a few hundred rows, closing the table and seeing if the log is applied ok; it works. UPDATE not handled. storage/maria/ma_recovery.h: new functions: ma_recover() for recovery from inside ha_maria; _ma_apply_log() for maria_read_log (ma_recover() calls _ma_apply_log()). Btw, we need to not use the word "recover" for REPAIR/maria_chk anymore. storage/maria/ma_rename.c: don't write log records during recovery storage/maria/ma_test2.c: - fail if maria_info() or other subtests find some wrong information - new option -g to skip updates. - init the translog before creating the table, so that log applying can work. - in "#if 0" you'll see some fixed bugs (will be removed). storage/maria/ma_test_all.sh: cleanup files. Test log applying. storage/maria/maria_read_log.c: most of the logic moves to ma_recovery.c to be shared between maria_read_log and recovery-from-inside-mysqld. See ma_recovery.c for additional changes made to the moved code. storage/maria/ma_test_recovery: unit test for Recovery. Tests insert and delete, REDO_UPDATE not yet coded. Script is called from ma_test_all. Can run standalone.
2007-07-26 11:56:21 +02:00
}
static MARIA_HA *get_MARIA_HA_from_UNDO_record(const
                                               TRANSLOG_HEADER_BUFFER *rec)
{
  uint16 sid;
  MARIA_HA *info;
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
  MARIA_SHARE *share;
  sid= fileid_korr(rec->header + LSN_STORE_SIZE);
  tprint(tracef, " For table of short id %u", sid);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to been seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
  info= all_tables[sid].info;
* WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. 
storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitly flushes log (WAL)) and flush log, both cause rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). 
mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet.
2007-11-13 17:12:29 +01:00
#ifndef DBUG_OFF
  /* Debug check: all log records of a group must be about the same table */
  DBUG_ASSERT(!in_redo_phase ||
              current_group_table == NULL || current_group_table == info);
  current_group_table= info;
#endif
  if (info == NULL)
  {
    tprint(tracef, ", table skipped, so skipping record\n");
    return NULL;
  }
  share= info->s;
  tprint(tracef, ", '%s'", share->open_file_name.str);
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
  if (cmp_translog_addr(rec->lsn, share->lsn_of_file_id) <= 0)
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
  {
    tprint(tracef, ", table's LOGREC_FILE_ID has LSN (%lu,0x%lx) more recent"
           " than record, skipping record",
           LSN_IN_PARTS(share->lsn_of_file_id));
    return NULL;
  }
  if (in_redo_phase &&
      cmp_translog_addr(rec->lsn, share->state.skip_redo_lsn) <= 0)
  {
    /* probably a bulk insert repair */
    tprint(tracef, ", has skip_redo_lsn (%lu,0x%lx) more recent than"
           " record, skipping record\n",
           LSN_IN_PARTS(share->state.skip_redo_lsn));
    return NULL;
  }
  DBUG_ASSERT(share->last_version != 0);
_ma_writeinfo(info, WRITEINFO_UPDATE_KEYFILE); /* to flush state on close */
tprint(tracef, ", applying record\n");
return info;
}
/**
@brief Parses checkpoint record.
Builds from it the dirty_pages list (a hash), opens tables and maps them to
their 2-byte IDs, recreates transactions (not real TRNs though).
@return LSN from where in the log the REDO phase should start
@retval LSN_ERROR error
@retval other ok
2007-09-12 11:27:34 +02:00
*/
static LSN parse_checkpoint_record(LSN lsn)
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
{
Added --with-maria-tmp-tables (default one) to allow one to configure if Maria should be used for internal temporary tables Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables Fixed bug that caused update of big blobs to crash Use pagecache_page_no_t as type for pages (to get rid of compiler warnings) Added cast to get rid of compiler warning Fixed wrong types of variables and arguments that caused lost information Fixed wrong DBUG_ASSERT() that caused REDO of big blobs to fail Removed some historical ifdefs that caused problem with windows compilations BUILD/SETUP.sh: Added --with-maria-tmp-tables include/maria.h: Use pagecache_page_no_t as type for pages Use my_bool as parameter for 'rep_quick' option include/my_base.h: Added comment mysql-test/r/maria-big.result: Added test that uses big blobs mysql-test/t/maria-big.test: Added test that uses big blobs sql/mysqld.cc: Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables sql/sql_class.h: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined sql/sql_select.cc: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined storage/maria/ha_maria.cc: Fixed compiler warnings reported by MCC - Fixed usage of wrong types that caused data loss - Changed parameter for rep_quick to my_bool - Added safe casts Fixed indentation storage/maria/ma_bitmap.c: Use pagecache_page_no_t as type for pages Fixed compiler warnings Fixed bug that caused update of big blobs to crash storage/maria/ma_blockrec.c: Use pagecache_page_no_t as type for pages Use my_bool as parameter for 'rep_quick' option Fixed compiler warnings Fixed wrong DBUG_ASSERT() storage/maria/ma_blockrec.h: Use pagecache_page_no_t as type for pages storage/maria/ma_check.c: Fixed some wrong parameters where we didn't get all bits for test_flag Changed rep_quick to be of type my_bool Use pagecache_page_no_t as type for pages Added cast's to get rid of compiler 
warnings Changed type of record_pos to get rid of compiler warning storage/maria/ma_create.c: Added safe cast's to get rid of compiler warnings storage/maria/ma_dynrec.c: Fixed usage of wrong type storage/maria/ma_key.c: Fixed compiler warning storage/maria/ma_key_recover.c: Use pagecache_page_no_t as type for pages storage/maria/ma_loghandler_lsn.h: Added cast's to get rid of compiler warnings storage/maria/ma_page.c: Changed variable name from 'page' to 'pos' as it was an offset and not a page address Moved page_size inside block to get rid of compiler warning storage/maria/ma_pagecache.c: Fixed compiler warnings Replaced compile time assert with TODO storage/maria/ma_pagecache.h: Use pagecache_page_no_t as type for pages storage/maria/ma_pagecrc.c: Allow bitmap pages that are all zero storage/maria/ma_preload.c: Added cast to get rid of compiler warning storage/maria/ma_recovery.c: Changed types to get rid of compiler warnings Use bool for quick_repair to get rid of compiler warning Fixed some variables that were wrongly declared (not enough precision) Added cast to get rid of compiler warning storage/maria/ma_test2.c: Remove historical undefs storage/maria/maria_chk.c: Changed rep_quick to bool Fixed wrong parameter to maria_chk_data_link() storage/maria/maria_def.h: Use pagecache_page_no_t as type for pages storage/maria/maria_pack.c: Renamed isam -> maria storage/maria/plug.in: Added option --with-maria-tmp-tables storage/maria/trnman.c: Added cast to get rid of compiler warning storage/myisam/mi_test2.c: Remove historical undefs
2008-01-10 20:21:36 +01:00
  ulong i;
  ulonglong nb_dirty_pages;
  TRANSLOG_HEADER_BUFFER rec;
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
  TRANSLOG_ADDRESS start_address;
  int len;
  uint nb_active_transactions, nb_committed_transactions, nb_tables;
  uchar *ptr;
  LSN minimum_rec_lsn_of_active_transactions, minimum_rec_lsn_of_dirty_pages;
  struct st_dirty_page *next_dirty_page_in_pool;
  tprint(tracef, "Loading data from checkpoint record at LSN (%lu,0x%lx)\n",
         LSN_IN_PARTS(lsn));
  if ((len= translog_read_record_header(lsn, &rec)) == RECHEADER_READ_ERROR)
  {
    tprint(tracef, "Cannot find checkpoint record where it should be\n");
    return LSN_ERROR;
  }
  enlarge_buffer(&rec);
  if (log_record_buffer.str == NULL ||
      translog_read_record(rec.lsn, 0, rec.record_length,
                           log_record_buffer.str, NULL) !=
      rec.record_length)
  {
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these were written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdb extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it.
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution (This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculating place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible.
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deletion of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down.
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables) storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automatically from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
eprint(tracef, "Failed to read record");
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for a too-paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
return LSN_ERROR;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but lacks a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page did not in fact exist yet, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page.
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDOs for INSERT/UPDATE/DELETE are always about real transactions, so we can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
}
ptr= log_record_buffer.str;
start_address= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
tprint(tracef, "Checkpoint record has start_horizon at (%lu,0x%lx)\n",
LSN_IN_PARTS(start_address));
/* transactions */
nb_active_transactions= uint2korr(ptr);
ptr+= 2;
tprint(tracef, "%u active transactions\n", nb_active_transactions);
minimum_rec_lsn_of_active_transactions= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
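The header decoding performed by the lines above can be sketched in isolation as follows. This is only an illustrative, standalone approximation and not the engine's code: it assumes a stored LSN occupies 7 bytes (a 3-byte log file number followed by a 4-byte offset, both little-endian) and that the transaction count is a 2-byte little-endian integer. All names prefixed with `sketch_`, as well as `read_le`, `read_lsn` and `parse_checkpoint_header`, are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a stored LSN is 3 bytes of file number + 4 bytes of offset. */
#define SKETCH_LSN_STORE_SIZE 7

typedef struct
{
  uint32_t file_no;   /* log file number (low 3 bytes used) */
  uint32_t offset;    /* byte offset inside that log file */
} sketch_lsn;

typedef struct
{
  sketch_lsn start_horizon;         /* where the REDO scan may start */
  uint16_t nb_active_transactions;  /* transactions active at checkpoint */
  sketch_lsn min_rec_lsn;           /* minimum rec_lsn of those transactions */
} sketch_checkpoint_header;

/* Read an n-byte (n <= 4) little-endian unsigned integer. */
static uint32_t read_le(const unsigned char *p, int n)
{
  uint32_t v= 0;
  for (int i= n - 1; i >= 0; i--)
    v= (v << 8) | p[i];
  return v;
}

/* Decode one stored LSN under the layout assumed above. */
static sketch_lsn read_lsn(const unsigned char *p)
{
  sketch_lsn lsn= { read_le(p, 3), read_le(p + 3, 4) };
  return lsn;
}

/*
  Decode the fixed-size start of a checkpoint record, advancing a cursor
  after each field the way the original code advances `ptr`.
  Returns 0 on success, -1 if the buffer is too short.
*/
static int parse_checkpoint_header(const unsigned char *buf, size_t len,
                                   sketch_checkpoint_header *out)
{
  const unsigned char *ptr= buf;
  if (len < 2 * SKETCH_LSN_STORE_SIZE + 2)
    return -1;
  out->start_horizon= read_lsn(ptr);
  ptr+= SKETCH_LSN_STORE_SIZE;
  out->nb_active_transactions= (uint16_t) read_le(ptr, 2);
  ptr+= 2;
  out->min_rec_lsn= read_lsn(ptr);
  return 0;
}
```

Unlike the original, the sketch checks the buffer length before reading; in the real function that role is played by comparing the return value of translog_read_record() with rec.record_length.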
WL#3072 - Maria recovery. * fix for bitmap vs checkpoint bug which could lead to corrupted tables in case of crashes at certain moments: a bitmap could be flushed to disk even though it was inconsistent with the log (it could be flushed before REDO-UNDO are written to the log). One bug remains, need code from others. Tests added. Fix is to pin unflushable bitmap pages, and let checkpoint wait for them to be flushable. * fix for long_trid!=0 assertion failure at Recovery. * less useless wakeups in the background flush|checkpoint thread. * store global_trid_generator in checkpoint record. mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery.test: make it easier to locate subtests storage/maria/ma_bitmap.c: When we send a bitmap to the pagecache, if this bitmap is not in a flushable state we keep it pinned and add it to a list, it will be unpinned when the bitmap is flushable again. A new function _ma_bitmap_flush_all() used by checkpoint. A new function _ma_bitmap_flushable() used by block format to signal when it starts modifying a bitmap and when it is done with it. storage/maria/ma_blockrec.c: When starting a row operation (insert/update/delete), mark that the bitmap is not flushable (because for example INSERT is going to over-allocate in the bitmap to prevent other threads from using our data pages). If a checkpoint comes at this moment it will wait for the bitmap to be flushable before flushing it. When the operation ends, bitmap becomes flushable again; that transition is done under the bitmap's mutex (needed for correct synchro with a concurrent checkpoint); but for INSERT/UPDATE this happens inside _ma_bitmap_release_unused() at a place where it already has the mutex, so the only penalty (mutex adding) is in DELETE and UNDO of INSERT. In case of errors after setting the bitmap unflushable, we must always set it back to flushable or checkpoint would block. Debug possibilities to force a sleep while the bitmap is over-allocated. 
In case of error in get_head_or_tail() in allocate_and_write_block_record(), we still need to unpin all pages. Bugfix: _ma_apply_redo_insert_row_blobs() produced wrong data_file_length. storage/maria/ma_blockrec.h: new bitmap calls. storage/maria/ma_checkpoint.c: filter_flush_indirect not needed anymore (flushing bitmap pages happens in _ma_bitmap_flush_all() now). So st_filter_param::is_data_file|pages_covered_by_bitmap not needed. Other filter_flush* don't need to flush bitmap anymore. Add debug possibility to flush all bitmap pages outside of a checkpoint, to simulate pagecache LRU eviction. When the background flush/checkpoint thread notices it has nothing to flush, it now sleeps directly until the next potential checkpoint moment instead of waking up every second. When in checkpoint we decide to not store a table in the checkpoint record (because it has logged no writes for example), we can also skip flushing this table. storage/maria/ma_commit.c: comment is out-of-date storage/maria/ma_key_recover.c: comment fix storage/maria/ma_loghandler.c: comment is out-of-date storage/maria/ma_open.c: comment is out-of-date storage/maria/ma_pagecache.c: comment for bug to fix. And we don't take checkpoints at end of REDO phase yet so can trust block->type. storage/maria/ma_recovery.c: Comments. Now-unneeded code for incomplete REDO-UNDO groups removed. When we forget about an old transaction we must really forget about it with bzero() (fixes the "long_trid!=0 assertion" recovery bug). When we delete a row with maria_delete() we turn on STATE_NOT_OPTIMIZED_ROWS so we do the same when we see a CLR_END for an UNDO_ROW_INSERT or when we execute an UNDO_ROW_INSERT (in both cases a row was deleted). Pick up max_long_trid from the checkpoint record. storage/maria/maria_chk.c: comment storage/maria/maria_def.h: MARIA_FILE_BITMAP gets new members: 'flushable', 'bitmap_cond' and 'pinned_pages'. 
storage/maria/trnman.c: I used to think that recovery only needs to know the maximum TrID of the lists of active and committed transactions. But no, sometimes both lists can even be empty and their TrID should not be reused. So Checkpoint now saves global_trid_generator in the checkpoint record. storage/maria/trnman_public.h: macros to read/store a TrID mysql-test/r/maria-recovery-bitmap.result: result is ok. Without the code fix, we would get a corruption message about the bitmap page in CHECK TABLE EXTENDED. mysql-test/t/maria-recovery-bitmap-master.opt: usual when we crash mysqld in tests mysql-test/t/maria-recovery-bitmap.test: test of recovery problems specific to the bitmap pages.
2007-12-14 16:14:12 +01:00
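The commit note above mentions macros to read and store a TrID in the checkpoint record. As a minimal sketch of that idea, assuming a 6-byte, least-significant-byte-first layout (the usual convention of the `*korr`/`*store` helpers), the pair below round-trips a 48-bit id through a byte buffer. The names `SKETCH_TRANSID_SIZE`, `sketch_transid_store` and `sketch_transid_korr` are hypothetical and are not the macros from trnman_public.h.

```c
#include <stdint.h>

/* Assumed on-disk width of a transaction id, in bytes (hypothetical name). */
#define SKETCH_TRANSID_SIZE 6

/* Store the low 48 bits of trid at dst, least significant byte first. */
static void sketch_transid_store(unsigned char *dst, uint64_t trid)
{
  int i;
  for (i= 0; i < SKETCH_TRANSID_SIZE; i++)
    dst[i]= (unsigned char) (trid >> (8 * i));
}

/* Read back a 48-bit id written by sketch_transid_store(). */
static uint64_t sketch_transid_korr(const unsigned char *src)
{
  uint64_t trid= 0;
  int i;
  for (i= 0; i < SKETCH_TRANSID_SIZE; i++)
    trid|= ((uint64_t) src[i]) << (8 * i);
  return trid;
}
```

A reader of the record would call the read helper and then advance its pointer by the stored size, which is the pattern the parsing code on this page follows.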
max_long_trid= transid_korr(ptr);
ptr+= TRANSID_SIZE;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
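The ma_bitmap.c note in the commit message above describes making creation of the first bitmap page crash-safe: size the file to 8190 bytes first, then write the 2-byte end marker, so that a crash between the two steps leaves a page that is visibly too short rather than a full-sized page without its marker. The following is only a sketch of that ordering with plain POSIX calls. The function names and the three-way return convention are hypothetical, and the real code goes through Maria's own file wrappers rather than `ftruncate()` and `pwrite()` directly.

```c
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_PAGE_SIZE   8192  /* bitmap page size used in the commit note */
#define SKETCH_MARKER_SIZE 2     /* the "bm" end marker */

/* Create the first bitmap page in two steps; returns 0 on success, -1 on error. */
static int sketch_create_first_bitmap_page(int fd)
{
  static const unsigned char marker[SKETCH_MARKER_SIZE]= { 'b', 'm' };
  off_t body= SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE;

  /* Step 1: size the file to the page minus its marker (zero-filled). */
  if (ftruncate(fd, body) != 0)
    return -1;
  /* Step 2: appending the marker is what makes the page full-sized. */
  if (pwrite(fd, marker, SKETCH_MARKER_SIZE, body) != SKETCH_MARKER_SIZE)
    return -1;
  return 0;
}

/*
  Classify the first page: 1 = complete, 0 = too short (safe to recreate),
  -1 = full-sized but marker missing, or an I/O error.
*/
static int sketch_check_first_bitmap_page(int fd)
{
  unsigned char tail[SKETCH_MARKER_SIZE];
  off_t len= lseek(fd, 0, SEEK_END);

  if (len < 0)
    return -1;
  if (len < SKETCH_PAGE_SIZE)
    return 0;
  if (pread(fd, tail, SKETCH_MARKER_SIZE,
            SKETCH_PAGE_SIZE - SKETCH_MARKER_SIZE) != SKETCH_MARKER_SIZE)
    return -1;
  return (tail[0] == 'b' && tail[1] == 'm') ? 1 : -1;
}
```

With this ordering, the only state a crash can leave behind between the two steps is the 8190-byte file, which the check reports as short rather than corrupted.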
/*
how much brain juice and discussion it took to arrive at this
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
line. It may make start_address slightly decrease (only by the time it
takes to write one or a few rows, roughly).
*/
tprint(tracef, "Checkpoint record has min_rec_lsn of active transactions"
" at (%lu,0x%lx)\n",
LSN_IN_PARTS(minimum_rec_lsn_of_active_transactions));
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
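The code around this point prints an LSN as a (file, offset) pair and then lowers start_address to the minimum rec_lsn of active transactions. The sketch below illustrates why a plain integer comparison is enough for that step, under the assumption that an LSN packs the log file number in the high bits and the byte offset in the low 32 bits. The type `sketch_lsn` and every function name here are hypothetical stand-ins, not the definitions from ma_loghandler_lsn.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN: file number above a 32-bit offset. */
typedef uint64_t sketch_lsn;

static sketch_lsn sketch_make_lsn(uint32_t file_no, uint32_t offset)
{
  return ((sketch_lsn) file_no << 32) | offset;
}

static uint32_t sketch_lsn_file(sketch_lsn lsn)
{
  return (uint32_t) (lsn >> 32);
}

static uint32_t sketch_lsn_offset(sketch_lsn lsn)
{
  return (uint32_t) (lsn & 0xFFFFFFFFu);
}

/*
  Lower start_address to the smallest of the given rec_lsn values, if any
  of them is smaller. Because the file number sits in the high bits, the
  integer order is "earlier file first, then earlier offset".
*/
static sketch_lsn sketch_lower_start(sketch_lsn start_address,
                                     const sketch_lsn *rec_lsns, size_t n)
{
  size_t i;
  for (i= 0; i < n; i++)
    if (rec_lsns[i] < start_address)
      start_address= rec_lsns[i];
  return start_address;
}
```

With no active transactions the start address is left unchanged; otherwise it can only move backwards in the log, which matches the "slightly decrease" remark in the comment.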
set_if_smaller(start_address, minimum_rec_lsn_of_active_transactions);
for (i= 0; i < nb_active_transactions; i++)
{
uint16 sid= uint2korr(ptr);
TrID long_id;
LSN undo_lsn, first_undo_lsn;
ptr+= 2;
long_id= uint6korr(ptr);
ptr+= 6;
DBUG_ASSERT(sid > 0 && long_id > 0);
undo_lsn= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
first_undo_lsn= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
new_transaction(sid, long_id, undo_lsn, first_undo_lsn);
}
nb_committed_transactions= uint4korr(ptr);
ptr+= 4;
tprint(tracef, "%lu committed transactions\n",
(ulong)nb_committed_transactions);
/* no purging => committed transactions are not important */
ptr+= (6 + LSN_STORE_SIZE) * nb_committed_transactions;
/* tables */
nb_tables= uint4korr(ptr);
ptr+= 4;
tprint(tracef, "%u open tables\n", nb_tables);
for (i= 0; i< nb_tables; i++)
{
char name[FN_REFLEN];
LSN first_log_write_lsn;
uint name_len;
uint16 sid= uint2korr(ptr);
ptr+= 2;
DBUG_ASSERT(sid > 0);
first_log_write_lsn= lsn_korr(ptr);
ptr+= LSN_STORE_SIZE;
name_len= strlen((char *)ptr) + 1;
strmake(name, (char *)ptr, sizeof(name)-1);
ptr+= name_len;
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
if (new_table(sid, name, first_log_write_lsn))
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
return LSN_ERROR;
}
/* dirty pages */
nb_dirty_pages= uint8korr(ptr);
/* Ensure casts later will not lose significant bits. */
Fixed compiler warnings by adding casts and changing variable types Fixed bug that caused change_user.test to fail Fixed bug that caused mysql_client_test to fail include/myisam.h: Fixed prototypes mysql-test/r/create.result: Fix that test works even if Maria is not used for temporary tables mysql-test/t/create.test: Fix that test works even if Maria is not used for temporary tables sql/mysqld.cc: Fixed that default value of max_join_size is set correctly; It needs to match usage in set_var.cc sql/set_var.cc: Fixed test, now when max_join_size is initialized correctly sql/sql_select.cc: Fixed that one can compile without -DUSE_MARIA_FOR_TMP_TABLES storage/maria/ma_blockrec.c: Fixed compiler warnings by adding casts storage/maria/ma_checkpoint.c: Fixed compiler warnings by adding casts storage/maria/ma_create.c: Fixed compiler warnings by adding casts storage/maria/ma_delete_table.c: Fixed compiler warnings by adding casts storage/maria/ma_loghandler.c: Fixed compiler warnings by adding casts and changing types for variables Changed translog_new_page_header to use a changing integer instead of calling time(), as time() is a slow call and will give the same result when called many times within one second storage/maria/ma_pagecrc.c: Fixed compiler warnings by adding casts storage/maria/ma_recovery.c: Fixed indentation storage/myisam/ha_myisam.cc: Fixed wrong types for variables Changed chk_data_link() and repair*() functions to take my_bool as argument storage/myisam/mi_check.c: Fixes to handle that param.test_flag is now ulonglong storage/myisam/myisamchk.c: Fixes to handle that param.test_flag is now ulonglong support-files/compiler_warnings.supp: Fixed line numbers
2008-01-11 18:39:43 +01:00
DBUG_ASSERT((nb_dirty_pages <= SIZE_T_MAX/sizeof(struct st_dirty_page)) &&
(nb_dirty_pages <= ULONG_MAX));
ptr+= 8;
Added --loose-skip-maria to MYSQLD_BOOTSTRAP_CMD to get bootstrap.test to work Allow one to run bootstrap even if --skip-maria is used (needed for bootstrap.test) Fixed lots of compiler warnings NOTE: maria-big and maria-recover tests fail because of bugs in transaction log handling. Sanja knows about this and is working on it! mysql-test/mysql-test-run.pl: Added --loose-skip-maria to MYSQLD_BOOTSTRAP_CMD to get bootstrap.test to work mysql-test/r/maria-recovery.result: Updated results mysql-test/t/bootstrap.test: Removed not needed empty line mysql-test/t/change_user.test: Fixed results for 32 bit systems mysql-test/t/maria-big.test: Only run this when you use --big mysql-test/t/maria-recovery.test: Added test case for recovery with big blobs mysys/my_uuid.c: Fixed compiler warning sql/mysqld.cc: Allow one to run bootstrap even if --skip-maria is used (needed for bootstrap.test) sql/set_var.cc: Compare max_join_size with ULONG_MAX instead of HA_POS_ERROR as we set max_join_size to ULONG_MAX by default storage/maria/ma_bitmap.c: Added __attribute((unused)) to fix compiler warning storage/maria/ma_blockrec.c: Added casts to remove compiler warnings Change variable types to avoid compiler warnings storage/maria/ma_check.c: Added casts to remove compiler warnings storage/maria/ma_checkpoint.c: Change variable types to avoid compiler warnings storage/maria/ma_create.c: Change variable types to avoid compiler warnings storage/maria/ma_delete.c: Added casts to remove compiler warnings storage/maria/ma_key_recover.c: Added casts to remove compiler warnings storage/maria/ma_loghandler.c: Moved initialization of prev_buffer first as this could otherwise not be set in case of errors storage/maria/ma_page.c: Added casts to remove compiler warnings storage/maria/ma_pagecache.c: Added __attribute((unused)) to fix compiler warning storage/maria/ma_pagecrc.c: Added #ifndef DBUG_OFF to remove compiler warning storage/maria/ma_recovery.c: Added casts to remove compiler warnings 
storage/maria/ma_write.c: Added casts to remove compiler warnings storage/maria/maria_chk.c: Split long string into two to avoid compiler warnings storage/myisam/ft_boolean_search.c: Added LINT_INIT() to remove compiler warning support-files/compiler_warnings.supp: Suppress wrong compiler warning unittest/mytap/tap.c: Fixed declaration to match prototypes to remove compiler warnings
2008-01-11 00:47:52 +01:00
tprint(tracef, "%lu dirty pages\n", (ulong) nb_dirty_pages);
if (hash_init(&all_dirty_pages, &my_charset_bin, (ulong)nb_dirty_pages,
offsetof(struct st_dirty_page, file_and_page_id),
sizeof(((struct st_dirty_page *)NULL)->file_and_page_id),
NULL, NULL, 0))
return LSN_ERROR;
dirty_pages_pool=
(struct st_dirty_page *)my_malloc((size_t)nb_dirty_pages *
sizeof(struct st_dirty_page),
MYF(MY_WME));
if (unlikely(dirty_pages_pool == NULL))
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
return LSN_ERROR;
next_dirty_page_in_pool= dirty_pages_pool;
minimum_rec_lsn_of_dirty_pages= LSN_MAX;
if (maria_recovery_verbose)
tprint(tracef, "Table_id Is_index Page_id Rec_lsn\n");
for (i= 0; i < nb_dirty_pages ; i++)
{
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
pgcache_page_no_t page_id;
LSN rec_lsn;
Windows fixes -new option WITH_MARIA_STORAGE_ENGINE for config.js -correct build errors -build test executables -downport changes for atomic functions from 5.2 -remove LOCK_uuid_generator from C++ files to avoid linker errors -new function my_uuid2str() BitKeeper/deleted/.del-x86-msvc.h: Delete: include/atomic/x86-msvc.h CMakeLists.txt: Windows fixes: -New option WITH_MARIA_STORAGE_ENGINE -Add unit tests include/Makefile.am: replace x86-msvc.h with generic-msvc.h include/config-win.h: my_chmod() support include/my_atomic.h: Downport my_atomic from 5.2 tree include/my_bit.h: Correct unresolved symbol errors on Windows include/my_pthread.h: pthread_mutex_unlock now returns 0 (was void previously) defined PTHREAD_STACK_MIN include/my_sys.h: New function my_uuid2str() define MY_UUID_STRING_LENGTH include/atomic/nolock.h: Downport my_atomic from 5.2 tree libmysqld/CMakeLists.txt: New option WITH_MARIA_STORAGE_ENGINE mysys/CMakeLists.txt: Add missing files mysys/lf_dynarray.c: Fix compiler errors on Windows mysys/my_getncpus.c: Windows port mysys/my_uuid.c: Windows fixes: there is no random() on Windows, use ANSI rand() New function my_uuid2str() mysys/my_winthread.c: Downport from 5.2 tree -Call my_thread_end() before pthread_exit() -Avoid crash if pthread_create is called with NULL attributes sql/CMakeLists.txt: Link mysqld with Maria storage engine sql/item_func.cc: Remove LOCK_uuid_generator from C++ to avoid linker errors. Use dedicated mutex for short uuids sql/item_strfunc.cc: Use my_uuid() and my_uuid2str() functions from mysys. 
sql/item_strfunc.h: Define MY_UUID_STRING_LENGTH in my_sys.h sql/mysql_priv.h: LOCK_uuid_generator must be declared as extern "C" sql/mysqld.cc: Init and destroy LOCK_uuid_short mutex storage/maria/CMakeLists.txt: -Use the same source files as in Makefile.am -Build test binaries storage/maria/ha_maria.cc: snprintf->my_snprintf storage/maria/lockman.c: Fix compiler error on Windows storage/maria/ma_check.c: Fix compiler error on Windows storage/maria/ma_loghandler.c: Fix compile errors my_open()/my_sync() do not work for directories on Windows storage/maria/ma_recovery.c: Fix compile error on Windows storage/maria/ma_test2.c: Rename variable to avoid naming conflict with Microsoft C runtime function storage/maria/ma_test3.c: Fix build errors on Windows storage/maria/tablockman.c: Fix build errors on Windows storage/maria/unittest/Makefile.am: Add CMakeLists.txt storage/maria/unittest/ma_pagecache_consist.c: Fix build errors on Windows remove loop from get_len() storage/maria/unittest/ma_pagecache_single.c: Fix build errors on Windows storage/maria/unittest/ma_test_loghandler-t.c: Windows fixes -Avoid division by 0 in expressions like x/(RAND_MAX/y), where y is larger than RAND_MAX(==0x7fff on Windows) storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Windows fixes -Avoid division by 0 in expressions like x/(RAND_MAX/y), where y is larger than RAND_MAX(==0x7fff on Windows) -remove loop in get_len() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Windows fixes -Avoid division by 0 in expressions like x/(RAND_MAX/y), where y is larger than RAND_MAX(==0x7fff on Windows) -remove loop in get_len() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Fix build errors on Windows storage/maria/unittest/test_file.c: Correct the code to get file size on Windows. stat() information can be outdated and thus cannot be trusted. On Vista,stat() returns file size=0 until the file is closed at the first time. 
storage/myisam/CMakeLists.txt: Fix compiler errors on Windows Build test executables storage/myisam/mi_test2.c: Rename variable to avoid naming conflict with Microsoft C runtime function storage/myisam/mi_test3.c: Fix build errors on Windows strings/CMakeLists.txt: Add missing file unittest/unit.pl: Windows: downport unittest changes from 5.2 bk tree unittest/mysys/Makefile.am: Windows: downport unittest changes from 5.2 bk tree unittest/mysys/my_atomic-t.c: Windows: downport unittest changes from 5.2 bk tree unittest/mytap/Makefile.am: Windows: downport unittest changes from 5.2 bk tree unittest/mytap/tap.c: Windows: downport unittest changes from 5.2 bk tree win/configure.js: Add WITH_MARIA_STORAGE_ENGINE configure option unittest/mytap/CMakeLists.txt: Add missing file unittest/mysys/CMakeLists.txt: Add missing file storage/maria/unittest/CMakeLists.txt: Add missing file BitKeeper/etc/ignore: Added comments maria-win.patch to the ignore list include/atomic/generic-msvc.h: Implement atomic operations with MSVC intrinsics
2008-01-10 13:21:53 +01:00
uint32 is_index;
uint16 table_id= uint2korr(ptr);
ptr+= 2;
is_index= ptr[0];
ptr++;
Minor changes. New description in SHOW ENGINES for Maria. Test for BUG#34106 "auto_increment is reset to 1 when table is recovered from crash" (fixed by Monty yesterday) mysql-test/r/maria-recovery.result: result, which is correct (before pulling Monty's fix for BUG#34106, we got a warning about auto_increment in CHECK TABLE (done in maria-verify-recovery.inc), no AUTO_INCREMENT clause in SHOW CREATE TABLE, and a failure of the last INSERT. mysql-test/r/maria.result: result mysql-test/t/maria-recovery.test: Test for BUG#34106 mysql-test/t/maria.test: look at what is reported in SHOW ENGINES mysys/my_pread.c: changed my mind: if Count argument is >4GB, we'll surely see a segfault in the pread() call when it tries to read 4GB from memory, so no need to print it in ulonglong format (saves a function call). mysys/my_read.c: changed my mind: if Count argument is >4GB, we'll surely see a segfault in the pread() call when it tries to read 4GB from memory, so no need to print it in ulonglong format (saves a function call). mysys/my_write.c: changed my mind: if Count argument is >4GB, we'll surely see a segfault in the pread() call when it tries to read 4GB from memory, so no need to print it in ulonglong format (saves a function call). storage/maria/ha_maria.cc: Description representing the current reality. This can be changed later storage/maria/ma_page.c: When reading the new key_del from a page on disk, if there is a bug (like BUG#34062) this key_del could be wrong, we try to catch if it's out of the key file. storage/maria/ma_pagecache.c: - no truncation of page's number in DBUG_PRINT (useful for BUG#34062) - page_korr instead of uint5korr storage/maria/ma_recovery.c: page_korr instead of uint5korr storage/maria/plug.in: Description representing the current reality. This can be changed later.
2008-01-31 23:17:50 +01:00
page_id= page_korr(ptr);
ptr+= PAGE_STORE_SIZE;
rec_lsn= lsn_korr(ptr);
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
ptr+= LSN_STORE_SIZE;
if (new_page((is_index << 16) | table_id,
page_id, rec_lsn, next_dirty_page_in_pool++))
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE. storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or no. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
return LSN_ERROR;
if (maria_recovery_verbose)
tprint(tracef, "%8u %8u %12lu %lu,0x%lx\n", (uint) table_id,
(uint) is_index, (ulong) page_id, LSN_IN_PARTS(rec_lsn));
set_if_smaller(minimum_rec_lsn_of_dirty_pages, rec_lsn);
}
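The loop that ends above walks a packed list of dirty-page entries, advancing `ptr` past each field, combining the index/data flag and the table's short id into one key, and tracking the smallest `rec_lsn` seen. The following is a minimal, self-contained sketch of that pattern together with the end-of-buffer sanity check. It is not the Maria code: every `sketch_*` name is hypothetical, the field widths (2-byte id, 1-byte flag, 5-byte page number, 7-byte LSN) and the little-endian layout are assumptions made for illustration, and the real code uses `page_korr()`, `lsn_korr()`, `PAGE_STORE_SIZE` and `LSN_STORE_SIZE` from the Maria headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field widths assumed for this sketch only (not taken from Maria headers). */
enum
{
  SKETCH_ID_SIZE= 2,
  SKETCH_PAGE_SIZE= 5,
  SKETCH_LSN_SIZE= 7,
  SKETCH_ENTRY_SIZE= SKETCH_ID_SIZE + 1 + SKETCH_PAGE_SIZE + SKETCH_LSN_SIZE
};

/* One decoded entry; file_key mirrors "(is_index << 16) | table_id". */
struct sketch_dirty_page
{
  uint32_t file_key;
  uint64_t page_id;
  uint64_t rec_lsn;
};

/* Read an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t sketch_read_le(const unsigned char *p, size_t n)
{
  uint64_t v= 0;
  size_t i;
  assert(n <= 8);
  for (i= 0; i < n; i++)
    v|= (uint64_t) p[i] << (8 * i);
  return v;
}

/* Write the low n bytes of v in little-endian order. */
static void sketch_write_le(unsigned char *p, uint64_t v, size_t n)
{
  size_t i;
  for (i= 0; i < n; i++)
    p[i]= (unsigned char) (v >> (8 * i));
}

/* Encode one entry; used only to build input for the parser. */
static void sketch_write_entry(unsigned char *p, unsigned table_id,
                               unsigned is_index, uint64_t page_id,
                               uint64_t rec_lsn)
{
  sketch_write_le(p, table_id, SKETCH_ID_SIZE);
  p+= SKETCH_ID_SIZE;
  *p++= (unsigned char) is_index;
  sketch_write_le(p, page_id, SKETCH_PAGE_SIZE);
  p+= SKETCH_PAGE_SIZE;
  sketch_write_le(p, rec_lsn, SKETCH_LSN_SIZE);
}

/*
  Decode `count` entries from buf[0..len) into out[] and store the smallest
  rec_lsn in *min_rec_lsn. Returns 0 on success, -1 if the buffer is too
  short or if the read pointer does not land exactly on the end of the
  buffer (the "writer and reader went out of sync" check).
*/
static int sketch_parse_dirty_pages(const unsigned char *buf, size_t len,
                                    unsigned count,
                                    struct sketch_dirty_page *out,
                                    uint64_t *min_rec_lsn)
{
  const unsigned char *ptr= buf, *end= buf + len;
  unsigned i;
  *min_rec_lsn= UINT64_MAX;
  for (i= 0; i < count; i++)
  {
    uint32_t table_id, is_index;
    if ((size_t) (end - ptr) < (size_t) SKETCH_ENTRY_SIZE)
      return -1;
    table_id= (uint32_t) sketch_read_le(ptr, SKETCH_ID_SIZE);
    ptr+= SKETCH_ID_SIZE;
    is_index= *ptr++;
    out[i].page_id= sketch_read_le(ptr, SKETCH_PAGE_SIZE);
    ptr+= SKETCH_PAGE_SIZE;
    out[i].rec_lsn= sketch_read_le(ptr, SKETCH_LSN_SIZE);
    ptr+= SKETCH_LSN_SIZE;
    out[i].file_key= (is_index << 16) | table_id;
    if (out[i].rec_lsn < *min_rec_lsn)
      *min_rec_lsn= out[i].rec_lsn;
  }
  return ptr == end ? 0 : -1;
}
```

Comparing the final pointer against the end of the buffer is a cheap way to notice that the code writing the record and the code reading it disagree about the layout: any leftover or missing byte makes the parse fail instead of silently misreading later fields.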
/* after that, there will be no insert/delete into the hash */
/*
sanity check on record (did we screw up with all those "ptr+=", did the
checkpoint write code and checkpoint read code go out of sync?).
*/
if (ptr != (log_record_buffer.str + log_record_buffer.length))
{
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that cause reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd: 
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non transactional tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page content doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readability Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ERR_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages. 
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
eprint(tracef, "checkpoint record corrupted\n");
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
return LSN_ERROR;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If a crash happens between the two operations, this leaves a full bitmap page without its end marker. A later recovery may try to read this page, find that it exists but misses a marker, conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't in fact exist already, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
}
/*
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
start_address is now from where the dirty pages list can be ignored.
Find LSN higher or equal to this TRANSLOG_ADDRESS, suitable for
translog_read_record() functions.
*/
start_address= checkpoint_start=
translog_next_LSN(start_address, LSN_IMPOSSIBLE);
tprint(tracef, "Checkpoint record start_horizon now adjusted to"
" LSN (%lu,0x%lx)\n", LSN_IN_PARTS(start_address));
if (checkpoint_start == LSN_IMPOSSIBLE)
{
/*
There must be a problem, as our checkpoint record exists and is >= the
address which is stored in its first bytes, which is >= start_address.
*/
return LSN_ERROR;
}
  /* now, where the REDO phase should start reading log: */
  tprint(tracef, "Checkpoint has min_rec_lsn of dirty pages at"
         " LSN (%lu,0x%lx)\n", LSN_IN_PARTS(minimum_rec_lsn_of_dirty_pages));
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
  set_if_smaller(start_address, minimum_rec_lsn_of_dirty_pages);
  DBUG_PRINT("info",
             ("checkpoint_start: (%lu,0x%lx) start_address: (%lu,0x%lx)",
              LSN_IN_PARTS(checkpoint_start), LSN_IN_PARTS(start_address)));
  return start_address;
}
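The tail of the function above lowers the REDO start address to the smallest rec_lsn found among the checkpoint's dirty pages. A minimal, self-contained sketch of that idea follows. The type `lsn_t`, the 32/32-bit split between file number and offset, and the names `make_lsn`, `lsn_file_no`, `lsn_offset` and `redo_start_address` are assumptions made for illustration; they are not Maria's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN: log file number in the high 32 bits,
   byte offset inside that file in the low 32 bits (an assumption). */
typedef uint64_t lsn_t;

static lsn_t make_lsn(uint32_t file_no, uint32_t offset)
{
  return ((lsn_t)file_no << 32) | offset;
}

/* The two parts one would print as "(file,0xoffset)". */
static uint32_t lsn_file_no(lsn_t lsn) { return (uint32_t)(lsn >> 32); }
static uint32_t lsn_offset(lsn_t lsn)  { return (uint32_t)(lsn & 0xFFFFFFFFu); }

/* Start from the checkpoint's own address and move it back to the
   smallest rec_lsn of any dirty page, so that no REDO needed by a
   dirty page is skipped. */
static lsn_t redo_start_address(lsn_t checkpoint_start,
                                const lsn_t *rec_lsns, size_t count)
{
  lsn_t start= checkpoint_start;
  size_t i;
  assert(count == 0 || rec_lsns != NULL);
  for (i= 0; i < count; i++)
  {
    if (rec_lsns[i] < start)          /* same effect as set_if_smaller() */
      start= rec_lsns[i];
  }
  return start;
}
```

Because the file number sits in the high bits, a plain integer comparison orders LSNs first by log file and then by offset, which is what makes a simple minimum sufficient here.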
static int new_page(uint32 fileid, pgcache_page_no_t pageid, LSN rec_lsn,
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
                    struct st_dirty_page *dirty_page)
{
  /* serves as hash key */
Fix for BUG#34114 "maria_chk reports false error when several tables on command-line" and BUG#34062 "Maria table corruption on master". Use 5 bytes (instead of 4) to store page's number in the checkpoint record, to allow bigger table (1PB with maria-block-size=1kB). Help pushbuild not run out of memory by moving the portion of maria-recovery.test which generates lots of data into a -big.test. mysql-test/r/maria-recovery.result: result moved mysql-test/t/maria-recovery.test: piece which generates much data moved to maria-recovery-big.test mysys/my_pread.c: To fix BUG#34062, where a 1.1TB file was generated due to a wrong pwrite offset, it was useful to not lose precision on 'offset' in DBUG_PRINT, so that the crazy value is visible. mysys/my_read.c: To fix BUG#34062, where a 1.1TB file was generated due to a wrong pwrite offset, it was useful to not lose precision on 'offset' in DBUG_PRINT, so that the crazy value is visible. mysys/my_write.c: To fix BUG#34062, where a 1.1TB file was generated due to a wrong pwrite offset, it was useful to not lose precision on 'offset' in DBUG_PRINT, so that the crazy value is visible. storage/maria/ha_maria.cc: When starting a bulk insert, we throw away dirty index pages from the cache. Unique (non disabled) key insertions thus read out-of-date pages from the disk leading to BUG#34062 "Maria table corruption on master": a DELETE in procedure viewer_sp() had deleted all rows of viewer_tbl2 one by one, putting index page 1 into key_del; that page was thrown away at start of INSERT SELECT, then the INSERT SELECT needed a page to insert keys, looked at key_del, found 1, read page 1 from disk, and its out-of-date content was used to set the new value of key_del (crazy value of 1TB), then a later insertion needed another index page, tried to read page at this crazy offset and failed, leading to corruption mark. The fix is to destroy out-of-date pages and make the state consistent with that, i.e. call maria_delete_all_rows(). 
storage/maria/ma_blockrec.c: Special hook for UNDO_BULK_INSERT storage/maria/ma_blockrec.h: special hook for UNDO_BULK_INSERT storage/maria/ma_check.c: Fix for BUG#34114 "maria_chk reports false error when several tables on command-line": if the Nth (on the command line) table was BLOCK_RECORD it would start checks by using the param->record_checksum computed by checks of table N-1. storage/maria/ma_delete_all.c: comment storage/maria/ma_loghandler.c: special hook for UNDO_BULK_INSERT storage/maria/ma_page.c: comment storage/maria/ma_pagecache.c: page number is 5 bytes in checkpoint record now (allows bigger tables) storage/maria/ma_recovery.c: page number is 5 bytes in checkpoint record now storage/maria/ma_recovery_util.c: page number is 5 bytes now storage/maria/ma_write.c: typo mysql-test/r/maria-recovery-big.result: result is correct mysql-test/t/maria-recovery-big-master.opt: usual options for recovery tests mysql-test/t/maria-recovery-big.test: Moving out the big blob test to a -big test (it exhausts memory when using /dev/shm on certain machines)
2008-01-29 22:20:59 +01:00
  dirty_page->file_and_page_id= (((uint64)fileid) << 40) | pageid;
  dirty_page->rec_lsn= rec_lsn;
  return my_hash_insert(&all_dirty_pages, (uchar *)dirty_page);
}
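new_page() builds its hash key by shifting the file id left by 40 bits and OR-ing in the page number, which matches the 5-byte page numbers mentioned in the checkpoint-record change. The sketch below illustrates that packing in isolation. The names `make_dirty_page_key`, `key_file_id`, `key_page_id` and `PAGE_ID_BITS`, as well as the 24-bit bound on the file id, are assumptions of this example and do not come from the Maria source.

```c
#include <assert.h>
#include <stdint.h>

/* A 5-byte page number occupies the low 40 bits of the key. */
#define PAGE_ID_BITS 40
#define PAGE_ID_MASK ((((uint64_t)1) << PAGE_ID_BITS) - 1)

/* Pack a file id and a page number into one 64-bit hash key.
   Assumption of this sketch: the file id fits in the remaining 24 bits;
   otherwise its top bits would be shifted out and two different files
   could collide on the same key. */
static uint64_t make_dirty_page_key(uint32_t fileid, uint64_t pageid)
{
  assert(pageid <= PAGE_ID_MASK);
  assert(fileid < (((uint32_t)1) << 24));
  return (((uint64_t)fileid) << PAGE_ID_BITS) | pageid;
}

/* Recover the two components from a key. */
static uint32_t key_file_id(uint64_t key)
{
  return (uint32_t)(key >> PAGE_ID_BITS);
}

static uint64_t key_page_id(uint64_t key)
{
  return key & PAGE_ID_MASK;
}
```

The cast to a 64-bit type before the shift matters: shifting a 32-bit value by 40 would be undefined behaviour in C, which is why the original line casts fileid to uint64 first.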
static int close_all_tables(void)
{
  int error= 0;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easier to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entries for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position to number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pins pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose() - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more reasonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
  uint count= 0;
  LIST *list_element, *next_open;
  MARIA_HA *info;
  TRANSLOG_ADDRESS addr;
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that cause reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd: 
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non transactional tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page content doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readability Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ER_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages.
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
  DBUG_ENTER("close_all_tables");
  pthread_mutex_lock(&THR_LOCK_maria);
  if (maria_open_list == NULL)
    goto end;
  tprint(tracef, "Closing all tables\n");
  if (tracef != stdout)
  {
    if (recovery_message_printed == REC_MSG_NONE)
WL#3071 Maria checkpoint, WL#3072 Maria recovery instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]). Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line. sql_print_error() changed to use my_progname_short (nicer display). mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r"). include/my_sys.h: new flags to tell error_handler_hook that this is not an error but an information or warning mysql-test/mysql-test-run.pl: when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r") mysys/my_init.c: set my_progname_short mysys/my_static.c: my_progname_short added sql/mysqld.cc: * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria. 
* plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs()) * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen) storage/maria/ma_checkpoint.c: fprintf(stderr) -> ma_message_no_user() storage/maria/ma_checkpoint.h: function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr). storage/maria/ma_recovery.c: To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line. 071116 18:42:16 [Note] mysqld: Maria engine: starting recovery recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds); 071116 18:42:16 [Note] mysqld: Maria engine: recovery done storage/maria/maria_chk.c: my_progname_short moved to mysys storage/maria/maria_read_log.c: my_progname_short moved to mysys storage/myisam/myisamchk.c: my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
      print_preamble();
    for (count= 0, list_element= maria_open_list ;
         list_element ; count++, (list_element= list_element->next))
      ;
    fprintf(stderr, "tables to flush:");
    recovery_message_printed= REC_MSG_FLUSH;
  }
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update it's is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
/*
Since the end of end_of_redo_phase(), we may have written new records
(if UNDO phase ran) and thus the state is newer than at
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
end_of_redo_phase(); we need to bump is_of_horizon again.
*/
addr= translog_get_horizon();
for (list_element= maria_open_list ; ; list_element= next_open)
{
if (recovery_message_printed == REC_MSG_FLUSH)
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
{
fprintf(stderr, " %u", count--);
fflush(stderr);
}
if (list_element == NULL)
break;
WL#3072 Maria recovery * create page cache before initializing engine and not after, because Maria's recovery needs a page cache * make the creation of a bitmap page more crash-resistant * bugfix (see ma_blockrec.c) * back to old way: create an 8k bitmap page when creating table * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * maria_chk tags repaired table with a special LSN * reworking all around in ma_recovery.c (less duplication) mysys/my_realloc.c: noted an issue in my_realloc() sql/mysqld.cc: page cache needs to be created before engines are initialized, because Maria's initialization may do a recovery which needs the page cache. storage/maria/ha_maria.cc: update to new prototype storage/maria/ma_bitmap.c: when creating the first bitmap page we used chsize to 8192 bytes then pwrite (overwrite) the last 2 bytes (8191-8192). If crash between the two operations, this leaves a bitmap page full without its end marker. A later recovery may try to read this page and find it exists and misses a marker and conclude it's corrupted and fail. Changing the chsize to only 8190 bytes: recovery will then find the page is too short and recreate it entirely. storage/maria/ma_blockrec.c: Fix for a bug: when executing a REDO, if the data page is created, data_file_length was increased before _ma_bitmap_set(): _ma_bitmap_set() called _ma_read_bitmap_page() which, due to the increased data_file_length, expected to find a bitmap page on disk with a correct end marker; if the bitmap page didn't exist already in fact, this failed. Fixed by increasing data_file_length only after _ma_read_bitmap_page() has created the new bitmap page correctly. This bug could happen every time a REDO is about creating a new bitmap page. 
storage/maria/ma_check.c: empty data file has a bitmap page storage/maria/ma_control_file.c: useless parameter to ma_control_file_create_or_open(), just test if this is recovery. storage/maria/ma_control_file.h: new prototype storage/maria/ma_create.c: Back to how it was before: maria_create() creates an 8k bitmap page. Thus (bugfix) data_file_length needs to reflect this instead of being 0. storage/maria/ma_loghandler.c: as ma_test1 and ma_test2 now use real transactions and not dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always about real transactions, can assert this. A function for Recovery to assign a short id to a table. storage/maria/ma_loghandler.h: new function storage/maria/ma_loghandler_lsn.h: maria_chk tags repaired tables with this LSN storage/maria/ma_open.c: * enforce that DMLs on transactional tables use real transactions and not dummy_transaction_object. * test if table was repaired with maria_chk (which has to be seen as an import of an external table into the server), test validity of create_rename_lsn (header corruption detection) * comments. storage/maria/ma_recovery.c: * preparations for the UNDO phase: recreate TRNs * preparations for Checkpoint: list of dirty pages, testing of rec_lsn to know if page should be skipped during Recovery (unused in this patch as no Checkpoint module pushed yet) * reworking all around (less duplication) storage/maria/ma_recovery.h: a parameter to say if the UNDO phase should be skipped storage/maria/maria_chk.c: tag repaired tables with a special LSN storage/maria/maria_read_log.c: * update to new prototype * no UNDO phase in maria_read_log for now storage/maria/trnman.c: * a function for Recovery to create a transaction (TRN), needed in the UNDO phase * a function for Recovery to grab an existing transaction, needed in the UNDO phase (rollback all existing transactions) storage/maria/trnman_public.h: new functions
2007-08-29 16:43:01 +02:00
next_open= list_element->next;
info= (MARIA_HA*)list_element->data;
pthread_mutex_unlock(&THR_LOCK_maria); /* ok, UNDO phase not online yet */
WL#3071 Maria checkpoint Ability for flush_pagecache_blocks() to flush only certain pages of a file, as instructed by an option "filter" pointer-to-function argument; Checkpoint and background dirty page flushing use that to flush only pages which have been dirty for long enough and bitmap pages. Fix for a bug in flush_cached_blocks() (no idea if it could produce a bug in real life, but theoretically it is). Testing checkpoint in ma_test_recovery via ma_test1 and ma_test2. Background checkpoint & dirty pages flush thread is still disabled by default in ha_maria. mysql-test/r/maria.result: result update storage/maria/ha_maria.cc: blank after function comment storage/maria/ma_checkpoint.c: Using an enum instead of 0/1/2 (applying Sanja's review comments). The comment about "this is an horizon" can be removed as Sanja created translog_next_LSN() which parse_checkpoint_record() uses. Variables in ma_checkpoint_background() cannot be declared in the for() as their value must not be reset at each iteration! storage/maria/ma_pagecache.c: adding to flush_pagecache_blocks() optional arguments 'filter' (pointer to function) and 'filter_arg'; if filter!=NULL this function will be called for each block of the file and will reply if this block and following ones should be flushed or not (3 possible replies). Fixing a bug when flush_cached_blocks() skips a pinned page: it has to unset PCBLOCK_IN_FLUSH set by flush_pagecache_blocks_int(). storage/maria/ma_pagecache.h: flush_pagecache_blocks() is changed to take "filter" and "filter_arg" arguments. "filter", if it is not NULL, may return one value among enum pagecache_flush_filter_result. storage/maria/ma_recovery.c: open_count=0 when closing tables at the end of recovery. storage/maria/ma_test1.c: Optional checkpoints (-H#) at various stages (stages similar to --testflag), for testing of checkpoints. storage/maria/ma_test2.c: Optional checkpoints (-H#) at various stages (stages similar to -t), for testing of checkpoints. 
storage/maria/ma_test_recovery.expected: Result update: the results of the additional test run with -H# (checkpoints) are added here. They are exactly identical to without checkpoints except that the index's Root (printed by maria_chk) is more correct when using checkpoints. This is because checkpoint flushed the state, so it happens to be correct, while no-checkpoint does not flush the state, and recovery does not recover indexes so Root is never fixed. When we recover indices, this will go away. storage/maria/ma_test_recovery: We duplicate the loop of tests to add an additional run with checkpoints at various stages, to see if maria_read_log uses them fine.
2007-10-17 16:55:26 +02:00
/*
Tables which we see here are exactly those which were open at time of
crash. They might have open_count>0 as Checkpoint may have flushed their
state while they were used. As Recovery corrected them, don't alarm the
user, don't ask for a table check:
*/
info->s->state.open_count= 0;
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update its is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
prepare_table_for_close(info, addr);
error|= maria_close(info);
pthread_mutex_lock(&THR_LOCK_maria);
}
end:
pthread_mutex_unlock(&THR_LOCK_maria);
Fixed several bugs in page CRC handling - Ignore CRC errors in REDO for potential new pages - Ignore CRC errors when repairing tables - Don't do readcheck callback on read error - Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC - Check index page for length before calculating CRC to catch bad pages Fixed bugs where we used wrong file descriptor to read/write bitmaps Fixed wrong hash key in 'files_in_flush' Fixed wrong lock method when writing bitmap Fixed some wrong printf statements in check/repair that caused core dumps Fixed argument to translog_page_validator that cause reading of log files to fail Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. Use fast 'dummy' pagecheck callbacks for temporary tables Don't die silently if flush finds pinned pages Give error (for now) if one tries to create a transactional table with fulltext or spatial keys Removed some not needed calls to pagecache_file_init() Added checking of pagecache checksums to ma_test1 and ma_test2 More DBUG Fixed some DBUG_PRINT to be in line with rest of the code include/my_base.h: Added HA_ERR_INTERNAL_ERROR (used for flush with pinned pages) and HA_ERR_WRONG_CRC mysql-test/r/binlog_unsafe.result: Added missing DROP VIEW statement mysql-test/r/maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/r/ps_maria.result: Added TRANSACTIONAL=0 when testing with fulltext keys mysql-test/t/binlog_unsafe.test: Added missing DROP VIEW statement mysql-test/t/maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys Added test that verifies we can't yet create transactional test with fulltext or spatial keys mysql-test/t/ps_maria.test: Added TRANSACTIONAL=0 when testing with fulltext keys mysys/my_fopen.c: Fd: -> fd: mysys/my_handler.c: Added new error messages mysys/my_lock.c: Fd: -> fd: 
mysys/my_pread.c: Fd: -> fd: mysys/my_read.c: Fd: -> fd: mysys/my_seek.c: Fd: -> fd: mysys/my_sync.c: Fd: -> fd: mysys/my_write.c: Fd: -> fd: sql/mysqld.cc: Fixed wrong argument to my_uuid_init() sql/sql_plugin.cc: Unified DBUG_PRINT (for convert-dbug-for-diff) storage/maria/ma_bitmap.c: Fixed wrong lock method when writing bitmap Fixed valgrind error Use fast 'dummy' pagecheck callbacks for temporary tables Faster bitmap handling for non transactional tables storage/maria/ma_blockrec.c: Fixed that bitmap reading is done with the correct filehandle Handle reading of pages with wrong CRC when page content doesn't matter Use the page buffer also when we get WRONG CRC or FILE_TOO_SHORT. (Faster and fixed a couple of bugs) storage/maria/ma_check.c: Split long strings for readability Fixed some wrong printf statements that caused core dumps Use bitmap.file for bitmaps Ignore pages with wrong CRC storage/maria/ma_close.c: More DBUG_PRINT storage/maria/ma_create.c: Give error (for now) if one tries to create a crash safe table with fulltext or spatial keys storage/maria/ma_key_recover.c: Ignore HA_ERR_WRONG_CRC for new pages info->s -> share Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_loghandler.c: Fixed argument to translog_page_validator() storage/maria/ma_open.c: Removed old VMS specific code Added function to setup pagecache callbacks Moved code around to set 'share->temporary' early Removed some not needed calls to pagecache_file_init() storage/maria/ma_page.c: Store number of bytes used for delete-linked key pages to be able to use standard index CRC for deleted key pages. storage/maria/ma_pagecache.c: Don't do readcheck callback on read error Reset PCBLOCK_ERROR in pagecache_unlock_by_link() if we write page Set my_errno to HA_ERR_INTERNAL_ERROR if flush() finds pinned pages Don't die silently if flush finds pinned pages. 
Use correct file descriptor when flushing pages Fixed wrong hash key in 'files_in_flush'; This must be the file descriptor, not the PAGECACHE_FILE as there may be several PAGECACHE_FILE for same file descriptor More DBUG_PRINT storage/maria/ma_pagecrc.c: Removed inline from not tiny static function Set my_errno to HA_ERR_WRONG_CRC if we find page with wrong CRC (Otherwise my_errno may be 0, and a lot of other code will be confused) CRCerror -> error (to keep code uniform) Print crc with %lu, as in my_checksum() uchar* -> uchar * Check index page for length before calculating CRC to catch bad pages Added 'dummy' crc_check and filler functions that are used for temporary tables storage/maria/ma_recovery.c: More DBUG More message to users to give information what phase failed Better error message if recovery failed storage/maria/ma_test1.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/ma_test2.c: Added checking of page checksums (combined with 'c' to not have to add more test runs) storage/maria/maria_chk.c: Fixed wrong argument to _ma_check_print_error() storage/maria/maria_def.h: Added format information to _ma_check_print_xxxx functions uchar* -> uchar *
2007-12-18 02:21:32 +01:00
DBUG_RETURN(error);
}
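The loop that ends above walks the list of open tables with the list mutex held, but drops the mutex around the close of each table and re-takes it before moving on. A minimal standalone sketch of that pattern follows. It is not the Maria code: DEMO_TABLE, DEMO_LIST, demo_list_lock, demo_close() and demo_close_all() are hypothetical stand-ins, and the sketch assumes the close routine takes the list mutex itself, which is why the caller must release it first (a default pthread mutex is not recursive).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the table handle and the open-tables list. */
typedef struct demo_table
{
  unsigned open_count;  /* >0 would normally make the user check the table */
  int closed;           /* set once demo_close() has run */
} DEMO_TABLE;

typedef struct demo_list
{
  struct demo_list *next;
  DEMO_TABLE *data;
} DEMO_LIST;

static pthread_mutex_t demo_list_lock= PTHREAD_MUTEX_INITIALIZER;

/* Assumption: closing a table needs the list mutex, so callers must not hold it. */
static int demo_close(DEMO_TABLE *table)
{
  pthread_mutex_lock(&demo_list_lock);
  table->closed= 1;
  pthread_mutex_unlock(&demo_list_lock);
  return 0;
}

/* Close every table of the list; returns the OR of all close() results. */
static int demo_close_all(DEMO_LIST *open_list)
{
  int error= 0;
  DEMO_LIST *element, *next;

  pthread_mutex_lock(&demo_list_lock);
  for (element= open_list; element != NULL; element= next)
  {
    DEMO_TABLE *table= element->data;
    assert(table != NULL);
    next= element->next;      /* read the link while the mutex is still held */
    pthread_mutex_unlock(&demo_list_lock);
    table->open_count= 0;     /* state was corrected, do not ask for a check */
    error|= demo_close(table);
    pthread_mutex_lock(&demo_list_lock);
  }
  pthread_mutex_unlock(&demo_list_lock);
  return error;
}
```

Unlocking in the middle of the walk is only safe in this sketch because nothing else modifies the list concurrently; the comment in the original loop makes the same kind of assumption ("UNDO phase not online yet").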
Remove SAFE_MODE for opt_range as it disables UPDATE to use keys REDO optimization (Basically avoid moving blocks from/to pagecache) More command line arguments to maria_read_log Fixed recovery bug when recreating table sql/opt_range.cc: Remove SAFE_MODE for opt_range as it disables UPDATE to use keys storage/maria/ma_blockrec.c: REDO optimization Use new interface for pagecache_reads to avoid copying page buffers storage/maria/ma_loghandler.c: Patch from Sanja: - Added new parameter to translog_get_page to use direct links to pagecache - Changed scanner to be able to use direct links This avoids a lot of calls to bmove512() in page cache. storage/maria/ma_loghandler.h: Added direct link to pagecache objects storage/maria/ma_open.c: Added const to parameter Added missing braces storage/maria/ma_pagecache.c: From Sanja: - Added direct links to pagecache (from pagecache_read()) Direct link means that on pagecache_read we get back a pointer to the pagecache buffer From Monty: - Fixed arguments to init_page_cache to handle big page caches - Fixed compiler warnings - Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors storage/maria/ma_pagecache.h: Changed block numbers from int to long to be able to handle big page caches Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK storage/maria/ma_recovery.c: Fixed recovery bug when recreating table (table was kept open) Moved some variables to function start (portability) Added space to some print messages storage/maria/maria_chk.c: key_buffer_size -> page_buffer_size storage/maria/maria_def.h: Changed default page_buffer_size to 10M storage/maria/maria_read_log.c: Added more startup options: --version --undo (apply undo) --page_cache_size (to run with big cache sizes) --silent (to not get any output from --apply) storage/maria/unittest/ma_control_file-t.c: Fixed compiler warning storage/maria/unittest/ma_test_loghandler-t.c: Added new argument to translog_init_scanner() 
storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Added new argument to translog_init_scanner() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Added new argument to translog_init_scanner()
2007-09-27 13:18:28 +02:00
WL#3072 Maria recovery Misc changes: - fix for benign Valgrind error, compiler warnings - fix for a segfault in execution of maria_delete_all_rows() and one when taking multiple checkpoints - fix for too paranoid assertion - adding ability to take checkpoints at the end of the REDO phase and at the end of recovery. - other minor changes storage/maria/ha_maria.cc: The checkpoint done after Recovery is finished, is moved to maria_recover(). storage/maria/ma_bitmap.c: fix for Valgrind error: the "shadow debug copy" of the bitmap page started uninitialized and so ma_print_bitmap() would use it uninitialized storage/maria/ma_checkpoint.c: * reset pointers to NULL after freeing them, or we segfault at next checkpoint in my_realloc(). * fix for compiler warnings. storage/maria/ma_delete_all.c: info->trn is NULL for non-transactional tables storage/maria/ma_locking.c: correct assertion (it fired wrongly in execution of REDO_DROP_TABLE due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count() ->maria_lock_database(F_UNLCK); another solution would have been to not call _ma_decrement_open_count() (it's ok to have a wrong open count in a table which we are dropping), but the same problem would still exist for REDO_RENAME_TABLE). storage/maria/ma_loghandler.c: fail early if UNRECOVERABLE_ERROR storage/maria/ma_recovery.c: * new argument to maria_apply_log(): should it take checkpoints (at end of REDO phase and at the very end) or not. * moving the call to translog_next_LSN() into parse_checkpoint_record() ("hide the details"). * Refining an error detection for something which could happen if there is a checkpoint record in the log. 
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME), as it looks safer, and also changing how close_one_table() works: it now limits itself to scanning all_tables[], thus having one loop instead of two, which should be faster (as a result, it does not close tables not registered in this array, which is ok as there should not be any). storage/maria/ma_recovery.h: new parameter storage/maria/maria_read_log.c: update to new prototype
2007-10-08 19:08:25 +02:00
/**
@brief Close all table instances with a certain name which are present in
all_tables.
@param name Name of table
@param addr Log address passed to prepare_table_for_close()
*/
static my_bool close_one_table(const char *name, TRANSLOG_ADDRESS addr)
{
  my_bool res= 0;
  /* There are no other threads using the tables, so we don't need any locks */
  struct st_table_for_recovery *internal_table, *end;
  for (internal_table= all_tables, end= internal_table + SHARE_ID_MAX + 1;
       internal_table < end ;
       internal_table++)
  {
    MARIA_HA *info= internal_table->info;
Changed all file names in maria to LEX_STRING and removed some calls to strlen() Ensure that pagecache gives correct error number even if error for block happened mysys/my_pread.c: Indentation fix storage/maria/ha_maria.cc: filenames changed to be of type LEX_STRING storage/maria/ma_check.c: filenames changed to be of type LEX_STRING storage/maria/ma_checkpoint.c: filenames changed to be of type LEX_STRING storage/maria/ma_create.c: filenames changed to be of type LEX_STRING storage/maria/ma_dbug.c: filenames changed to be of type LEX_STRING storage/maria/ma_delete.c: filenames changed to be of type LEX_STRING storage/maria/ma_info.c: filenames changed to be of type LEX_STRING storage/maria/ma_keycache.c: filenames changed to be of type LEX_STRING storage/maria/ma_locking.c: filenames changed to be of type LEX_STRING storage/maria/ma_loghandler.c: filenames changed to be of type LEX_STRING storage/maria/ma_open.c: filenames changed to be of type LEX_STRING storage/maria/ma_pagecache.c: Store error number for last failed operation in the page block This should fix some asserts() when errno was not properly set after failure to read block in another thread storage/maria/ma_recovery.c: filenames changed to be of type LEX_STRING storage/maria/ma_update.c: filenames changed to be of type LEX_STRING storage/maria/ma_write.c: filenames changed to be of type LEX_STRING storage/maria/maria_def.h: filenames changed to be of type LEX_STRING storage/maria/maria_ftdump.c: filenames changed to be of type LEX_STRING storage/maria/maria_pack.c: filenames changed to be of type LEX_STRING
2008-08-25 13:49:47 +02:00
    if ((info != NULL) && !strcmp(info->s->open_file_name.str, name))
    {
      prepare_table_for_close(info, addr);
      if (maria_close(info))
        res= 1;
      internal_table->info= NULL;
    }
  }
  return res;
}
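The scan-and-close pattern used by close_one_table() can be tried in isolation. The following is a minimal sketch, not the Maria code: demo_handle, demo_slot, registry, demo_close() and REGISTRY_SIZE are hypothetical stand-ins for MARIA_HA, st_table_for_recovery, all_tables, maria_close() and SHARE_ID_MAX + 1, and the prepare_table_for_close() step is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REGISTRY_SIZE 8   /* hypothetical; stands in for SHARE_ID_MAX + 1 */

/* Hypothetical stand-in for an open table handle (MARIA_HA in the source). */
struct demo_handle
{
  const char *file_name;
  int close_fails;        /* lets a caller simulate a failing close */
  int closed;             /* set once demo_close() has run */
};

/* Hypothetical registry slot, mirroring the 'info' member used above. */
struct demo_slot
{
  struct demo_handle *info;
};

static struct demo_slot registry[REGISTRY_SIZE];

/* Stand-in for maria_close(): returns non-zero on failure. */
static int demo_close(struct demo_handle *h)
{
  h->closed= 1;
  return h->close_fails;
}

/*
  Close every registered handle whose file name equals 'name'.
  Returns 0 if all matching handles closed cleanly, 1 otherwise.
  A slot is cleared even when its close failed, so that a later scan
  cannot pick up a stale pointer.
*/
static int demo_close_by_name(const char *name)
{
  int res= 0;
  struct demo_slot *slot, *end;
  assert(name != NULL);
  for (slot= registry, end= slot + REGISTRY_SIZE; slot < end; slot++)
  {
    struct demo_handle *h= slot->info;
    if (h != NULL && !strcmp(h->file_name, name))
    {
      if (demo_close(h))
        res= 1;
      slot->info= NULL;
    }
  }
  return res;
}
```

As in the function above, one failing close does not stop the scan: the error is remembered in res and the remaining instances with the same name are still closed.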
WL#3072 - Maria recovery maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed. storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first storage/maria/ma_blockrec.c: comment storage/maria/ma_loghandler.c: new type of log record storage/maria/ma_loghandler.h: new type of log record storage/maria/ma_recovery.c: * maria_apply_log() now returns a count of warnings. What currently produces warnings is: - skipping applying UNDOs though there are some (=> inconsistent table) - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is so incomplete). Count of warnings affects the final message of maria_read_log and recovery (though in recovery none of the two conditions above should happen). * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning. 
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
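The marker mechanism described in the commit message above can be sketched in isolation. This is a toy model, not the Maria code: every `toy_` name, the fixed-size in-memory log, and the `now_transactional` field are assumptions made for illustration. Only the idea is taken from the commit message: disabling logging may leave a LOGREC_INCOMPLETE_LOG-style marker in the log, and the log reader returns a count of warnings when it meets one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified record types; only the idea of an
   "incomplete log" marker comes from the commit message. */
enum toy_rec_type { TOY_REDO_INSERT, TOY_INCOMPLETE_LOG };

#define TOY_LOG_CAPACITY 16

struct toy_log
{
  enum toy_rec_type recs[TOY_LOG_CAPACITY];
  size_t count;
};

struct toy_table
{
  int now_transactional;              /* 1 while changes are logged */
};

/* Append one record to the in-memory toy log */
static void toy_log_append(struct toy_log *log, enum toy_rec_type type)
{
  assert(log->count < TOY_LOG_CAPACITY);
  log->recs[log->count++]= type;
}

/* Stand-in for _ma_tmp_disable_logging_for_table(): the marker is written
   only when the caller says that the skipped records make the log
   incomplete (not the case in the REDO phase, for example). */
static void toy_disable_logging(struct toy_log *log, struct toy_table *table,
                                int log_incomplete)
{
  if (log_incomplete)
    toy_log_append(log, TOY_INCOMPLETE_LOG);
  table->now_transactional= 0;
}

/* An insert is logged only while the table is transactional */
static void toy_insert(struct toy_log *log, const struct toy_table *table)
{
  if (table->now_transactional)
    toy_log_append(log, TOY_REDO_INSERT);
}

/* Stand-in for the warning count returned by maria_apply_log():
   one warning per marker found while scanning the log. */
static unsigned toy_apply_log(const struct toy_log *log)
{
  unsigned warnings= 0;
  size_t i;
  for (i= 0; i < log->count; i++)
    if (log->recs[i] == TOY_INCOMPLETE_LOG)
      warnings++;
  return warnings;
}
```

A caller would log a few inserts, call `toy_disable_logging()` with `log_incomplete` set, and then check that `toy_apply_log()` reports one warning even though the later inserts left no records behind.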
/**
Temporarily disables logging for this table.
If that makes the log incomplete, writes a LOGREC_INCOMPLETE_LOG to the log
to warn log readers.
@param info table
@param log_incomplete if that disabling makes the log incomplete
@note for example in the REDO phase we disable logging but that does not
make the log incomplete.
*/
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
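The WAL rule from the commit message above (before a page goes to disk, the log must be flushed up to the address returned by that page's get_log_address callback) can be sketched as follows. This is a minimal model under stated assumptions: all `toy_` names and the two-field log state are hypothetical, and the real page cache interface is not reproduced here. What it keeps from the source is the split between data pages, which report the LSN stored in the page, and bitmap pages, which force the whole log.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_lsn;             /* hypothetical LSN type */

struct toy_log_state
{
  toy_lsn horizon;                    /* end of the log written so far */
  toy_lsn flushed_upto;               /* part of the log known to be on disk */
};

struct toy_page
{
  toy_lsn page_lsn;                   /* LSN stored in the page content */
  /* Tells the flusher up to which LSN the log must be durable first */
  toy_lsn (*get_log_address)(const struct toy_page *page,
                             const struct toy_log_state *log);
};

/* Data (non-bitmap) page: just read the LSN from the page */
static toy_lsn toy_data_get_log_address(const struct toy_page *page,
                                        const struct toy_log_state *log)
{
  (void) log;
  return page->page_lsn;
}

/* Bitmap page: ask for the whole log to be flushed */
static toy_lsn toy_bitmap_get_log_address(const struct toy_page *page,
                                          const struct toy_log_state *log)
{
  (void) page;
  return log->horizon;
}

/* Make the log durable up to 'upto'; never moves backwards */
static void toy_log_flush(struct toy_log_state *log, toy_lsn upto)
{
  assert(upto <= log->horizon);
  if (upto > log->flushed_upto)
    log->flushed_upto= upto;
}

/* Write-ahead logging: flush the log first, then the page */
static void toy_flush_page(const struct toy_page *page,
                           struct toy_log_state *log)
{
  toy_log_flush(log, page->get_log_address(page, log));
  /* the page itself would be written to disk here */
}
```

Keeping the decision in a per-page callback means the flushing code does not need to know what kind of page it holds, which appears to be the reason the commit message says changing share->page_type is no longer enough when transactionality is toggled.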
void _ma_tmp_disable_logging_for_table(MARIA_HA *info,
my_bool log_incomplete)
{
MARIA_SHARE *share= info->s;
WL#3072 Maria Recovery All statements doing an implicit commit now also do one in Maria. This is useful because LOCK TABLES; REPAIR; crash; is not rollback-able, so the implicit commit of REPAIR prevents Recovery from trying to roll back and failing. Fix for BUG#33827 "COMMIT AND CHAIN causes serious Valgrind error" (maybe not the definite one, depends on the assigned dev). mysql-test/t/maria-recovery.test: test of REPAIR's implicit commit. I cannot commit the result file because maria-recovery fails in vanilla tree (seen in pushbuild) but its new section looks like: repair table t1; Table Op Msg_type Msg_text mysqltest.t1 repair status OK insert into t1 values(2); select * from t1; a 1 2 3 SET SESSION debug="+d,maria_flush_whole_log,maria_flush_whole_page_cache,maria_crash"; * crashing mysqld intentionally set global maria_checkpoint_interval=1; ERROR HY000: Lost connection to MySQL server during query * recovery happens check table t1 extended; Table Op Msg_type Msg_text mysqltest.t1 check status OK * testing that checksum after recovery is as expected Checksum-check failure use mysqltest; select * from t1; a 1 3 Which is as it should be. sql/rpl_injector.cc: fix for BUG#33827 sql/sql_parse.cc: - All DDLs and mysql_admin_table() (REPAIR etc) use end_active_trans() to do an implicit commit so we add there an implicit commit of the Maria transaction. - Fix for BUG#33827 storage/maria/ha_maria.cc: - A method to do implicit commit in Maria - After an implicit commit, if it was under LOCK TABLES, the locked tables have a stale file->trn: update it. storage/maria/ha_maria.h: new static method storage/maria/ma_check.c: bugfix: this disabling of transactionality had the effect that if LOCK TABLES; REPAIR; INSERT then the INSERT ran non-transactional (so couldn't be undone in case of crash, if, by bad chance, its effect on pages went to disk). 
storage/maria/ma_checkpoint.c: indentation storage/maria/ma_recovery.c: dbug statements storage/maria/trnman.c: When doing an implicit commit we need to know the number of locked tables of the committed transaction and copy it to the new transaction storage/maria/trnman_public.h: prototype change
2008-01-11 22:48:54 +01:00
DBUG_ENTER("_ma_tmp_disable_logging_for_table");
if (log_incomplete)
{
uchar log_data[FILEID_STORE_SIZE];
Injecting more "const" declarations into code which does not change pointed data. I ran gcc -Wcast-qual on storage/maria, this identified un-needed casts, a couple of functions which said they had a const parameter though they changed the pointed content! This is fixed here. Some suspicious places receive a comment. The original intention of running -Wcast-qual was to find what code changes R-tree keys: I added const words, but hidden casts like those of int2store (casts target to (uint16*)) removed const checking; -Wcast-qual helped find those hidden casts. Log handler does not change the content pointed by LEX_STRING::str it receives, so we now use a struct which has a const inside, to emphasize this and be able to pass "const uchar*" buffers to log handler without fear of their content being changed by it. One-line fix for a merge glitch (when merging from MyISAM). include/m_string.h: As Maria's log handler uses LEX_STRING but never changes the content pointed by LEX_STRING::str, and assigns uchar* into this member most of the time, we introduce a new struct LEX_CUSTRING (C const U unsigned) for the log handler. include/my_global.h: In macros which read pointed content: use const pointers so that gcc -Wcast-qual does not warn about casting a const pointer to non-const. include/my_handler.h: In macros which read pointed content: use const pointers so that gcc -Wcast-qual does not warn about casting a const pointer to non-const. ha_find_null() does not change *a. include/my_sys.h: insert_dynamic() does not change *element. include/myisampack.h: In macros which read pointed content: use const pointers so that gcc -Wcast-qual does not warn about casting a const pointer to non-const. mysys/array.c: insert_dynamic() does not change *element mysys/my_handler.c: ha_find_null() does not change *a storage/maria/ma_bitmap.c: Log handler receives const strings now storage/maria/ma_blockrec.c: Log handler receives const strings now. 
_ma_apply_undo_row_delete/update() do change *header. storage/maria/ma_blockrec.h: correct prototype storage/maria/ma_check.c: Log handler receives const strings now. Un-needed casts storage/maria/ma_checkpoint.c: Log handler receives const strings now storage/maria/ma_checksum.c: unneeded cast storage/maria/ma_commit.c: Log handler receives const strings now storage/maria/ma_create.c: Log handler receives const strings now storage/maria/ma_dbug.c: fixing warning of gcc -Wcast-qual storage/maria/ma_delete.c: Log handler receives const strings now storage/maria/ma_delete_all.c: Log handler receives const strings now storage/maria/ma_delete_table.c: Log handler receives const strings now storage/maria/ma_dynrec.c: fixing some warnings of gcc -Wcast-qual. Unneeded casts removed. Comment about function which lies. storage/maria/ma_ft_parser.c: fix for warnings of gcc -Wcast-qual, removing unneeded casts storage/maria/ma_ft_update.c: less casts, comment storage/maria/ma_key.c: less casts, stay const (warnings of gcc -Wcast-qual) storage/maria/ma_key_recover.c: Log handler receives const strings now storage/maria/ma_loghandler.c: Log handler receives const strings now storage/maria/ma_loghandler.h: Log handler receives const strings now storage/maria/ma_loghandler_lsn.h: In macros which read pointed content: use const pointers so that gcc -Wcast-qual does not warn about casting a const pointer to non-const. storage/maria/ma_page.c: Log handler receives const strings now; more const storage/maria/ma_recovery.c: Log handler receives const strings now storage/maria/ma_rename.c: Log handler receives const strings now storage/maria/ma_rt_index.c: more const, to emphasize that functions don't change pointed content. 
best_key= NULL was forgotten during merge from MyISAM a few days ago, was causing a Valgrind warning storage/maria/ma_rt_index.h: new proto storage/maria/ma_rt_key.c: more const storage/maria/ma_rt_key.h: new proto storage/maria/ma_rt_mbr.c: more const for functions which deserve it storage/maria/ma_rt_mbr.h: new prototype storage/maria/ma_rt_split.c: make const what is not changed. storage/maria/ma_search.c: un-needed casts, more const storage/maria/ma_sp_key.c: more const storage/maria/ma_unique.c: un-needed casts. storage/maria/ma_write.c: Log handler receives const strings now storage/maria/maria_def.h: some more const storage/maria/unittest/ma_test_loghandler-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_multithread-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_noflush-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_nologs-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Log handler receives const strings now storage/maria/unittest/ma_test_loghandler_purge-t.c: Log handler receives const strings now
2008-04-03 15:40:25 +02:00
LEX_CUSTRING log_array[TRANSLOG_INTERNAL_PARTS + 1];
LSN lsn;
log_array[TRANSLOG_INTERNAL_PARTS + 0].str= log_data;
log_array[TRANSLOG_INTERNAL_PARTS + 0].length= sizeof(log_data);
translog_write_record(&lsn, LOGREC_INCOMPLETE_LOG,
Added --loose-skip-maria to MYSQLD_BOOTSTRAP_CMD to get bootstrap.test to work Allow one to run bootstrap even if --skip-maria is used (needed for bootstrap.test) Fixed lots of compiler warnings NOTE: maria-big and maria-recover tests fail because of bugs in transaction log handling. Sanja knows about this and is working on it! mysql-test/mysql-test-run.pl: Added --loose-skip-maria to MYSQLD_BOOTSTRAP_CMD to get bootstrap.test to work mysql-test/r/maria-recovery.result: Updated results mysql-test/t/bootstrap.test: Removed unneeded empty line mysql-test/t/change_user.test: Fixed results for 32 bit systems mysql-test/t/maria-big.test: Only run this when you use --big mysql-test/t/maria-recovery.test: Added test case for recovery with big blobs mysys/my_uuid.c: Fixed compiler warning sql/mysqld.cc: Allow one to run bootstrap even if --skip-maria is used (needed for bootstrap.test) sql/set_var.cc: Compare max_join_size with ULONG_MAX instead of HA_POS_ERROR as we set max_join_size to ULONG_MAX by default storage/maria/ma_bitmap.c: Added __attribute((unused)) to fix compiler warning storage/maria/ma_blockrec.c: Added casts to remove compiler warnings Change variable types to avoid compiler warnings storage/maria/ma_check.c: Added casts to remove compiler warnings storage/maria/ma_checkpoint.c: Change variable types to avoid compiler warnings storage/maria/ma_create.c: Change variable types to avoid compiler warnings storage/maria/ma_delete.c: Added casts to remove compiler warnings storage/maria/ma_key_recover.c: Added casts to remove compiler warnings storage/maria/ma_loghandler.c: Moved initialization of prev_buffer first as this could otherwise not be set in case of errors storage/maria/ma_page.c: Added casts to remove compiler warnings storage/maria/ma_pagecache.c: Added __attribute((unused)) to fix compiler warning storage/maria/ma_pagecrc.c: Added #ifndef DBUG_OFF to remove compiler warning storage/maria/ma_recovery.c: Added casts to remove compiler warnings 
storage/maria/ma_write.c: Added casts to remove compiler warnings storage/maria/maria_chk.c: Split long string into two to avoid compiler warnings storage/myisam/ft_boolean_search.c: Added LINT_INIT() to remove compiler warning support-files/compiler_warnings.supp: Suppress wrong compiler warning unittest/mytap/tap.c: Fixed declaration to match prototypes to remove compiler warnings
2008-01-11 00:47:52 +01:00
&dummy_transaction_object, info,
(translog_size_t) sizeof(log_data),
TRANSLOG_INTERNAL_PARTS + 1, log_array,
log_data, NULL);
}
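The commit note above says that maria_apply_log() returns a count of warnings, and that this count tempers the final "SUCCESS" message. A minimal sketch of that idea follows. It is not the real implementation: the `rec_kind` enum, `count_apply_warnings()` and `final_message()` are hypothetical names invented for illustration, and the exact counting rule (one warning per incomplete-log record, one for skipped UNDOs) is an assumption.

```c
#include <stddef.h>

/* Hypothetical record kinds; the real log has many more LOGREC_* types. */
enum rec_kind { REC_REDO, REC_UNDO, REC_INCOMPLETE_LOG };

/*
  Count conditions that should temper a final "SUCCESS" message:
  - each REC_INCOMPLETE_LOG record (the log lacks some REDO_INSERT*),
  - UNDOs present in the log while the caller chose not to apply them.
  The counting rule is an assumption made for this sketch.
*/
static unsigned count_apply_warnings(const enum rec_kind *recs, size_t n,
                                     int apply_undo)
{
  unsigned warnings= 0;
  int saw_undo= 0;
  size_t i;
  for (i= 0; i < n; i++)
  {
    if (recs[i] == REC_INCOMPLETE_LOG)
      warnings++;
    else if (recs[i] == REC_UNDO)
      saw_undo= 1;
  }
  if (saw_undo && !apply_undo)
    warnings++;                      /* table may be left inconsistent */
  return warnings;
}

/* Pick the closing message from the error flag and the warning count. */
static const char *final_message(int error, unsigned warnings)
{
  if (error)
    return "FAILED";
  return warnings ? "SUCCESS with warnings" : "SUCCESS";
}
```

A caller would pass the decoded record kinds of one log scan to `count_apply_warnings()` and hand the result to `final_message()` together with its own error flag.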
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocating tail pages - Ensure we reserve space for the directory entry when calculating space for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clears freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these were written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdb extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution (This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceding DBUG_ASSERT() ensures that the following code will not be executed) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes fails for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve space for the directory entry when calculating space for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deletion of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automatically cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables) storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not used (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automatically from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
/* if we disabled before writing the record, record wouldn't reach log */
share->now_transactional= FALSE;
  /*
    Reset state pointers. This is needed as in ALTER TABLE we may do
    commit followed by _ma_reenable_logging_for_table and then
    info->state may point to a state that was deleted by
    _ma_trnman_end_trans_hook()
  */
share->state.common= *info->state;
info->state= &share->state.common;
info->switched_transactional= TRUE;
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
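The commit note above describes two recovery details: the all_dirty_pages hash key is built from index_or_data, the table's 2-byte short id and the page number (instead of a file descriptor), and the LSN below which the dirty pages list is consulted is min(checkpoint_start_log_horizon, min(trn's rec_lsn)). The sketch below illustrates both under stated assumptions: the bit layout of the key, the use of 0 for "no rec_lsn", and the names `dirty_page_key()` and `dirty_pages_min_lsn()` are hypothetical and are not taken from ma_recovery.c.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_LSN UINT64_C(0)   /* assumed marker for "transaction has no rec_lsn" */

/*
  Pack (index-or-data flag, table short id, page number) into one 64-bit key.
  Assumed layout: bit 56 = flag, bits 40..55 = short id, bits 0..39 = page.
*/
static uint64_t dirty_page_key(int is_index, uint16_t short_id,
                               uint64_t pageno)
{
  return ((uint64_t) (is_index ? 1 : 0) << 56) |
         ((uint64_t) short_id << 40) |
         (pageno & ((UINT64_C(1) << 40) - 1));
}

/*
  Compute min(checkpoint start horizon, min over transactions of rec_lsn),
  ignoring transactions that carry no rec_lsn.
*/
static uint64_t dirty_pages_min_lsn(uint64_t checkpoint_start_horizon,
                                    const uint64_t *trn_rec_lsn, size_t n)
{
  uint64_t min_lsn= checkpoint_start_horizon;
  size_t i;
  for (i= 0; i < n; i++)
  {
    if (trn_rec_lsn[i] != NO_LSN && trn_rec_lsn[i] < min_lsn)
      min_lsn= trn_rec_lsn[i];
  }
  return min_lsn;
}
```

Because the key no longer depends on a file descriptor, the same page of the same table hashes identically across a restart, which is what lets a checkpoint record store only 2+1 bytes per table reference.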
/*
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle) - fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc) - when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix) - UNDO_BULK_INSERT_WITH_REPAIR is also used with repair, changes name mysql-test/r/maria.result: tests for bugs fixed mysql-test/t/maria.test: tests for bugs fixed sql/sql_insert.cc: Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging. storage/maria/ha_maria.cc: Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()). storage/maria/ha_maria.h: variable now can have 3 states not 2 storage/maria/ma_bitmap.c: name change storage/maria/ma_blockrec.c: name change storage/maria/ma_blockrec.h: name change storage/maria/ma_check.c: * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before. * the log record of bulk insert can be used even without repair * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush). storage/maria/ma_key_recover.c: name change storage/maria/ma_loghandler.c: name change storage/maria/ma_loghandler.h: name change storage/maria/ma_pagecache.c: A function, to check in debug builds that no dirty pages exist for a file. 
storage/maria/ma_pagecache.h: new function (nothing in non-debug) storage/maria/ma_recovery.c: _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist). storage/maria/maria_chk.c: comment storage/maria/maria_def.h: new names mysql-test/suite/rpl/r/rpl_stm_maria.result: result (it used to crash) mysql-test/suite/rpl/t/rpl_stm_maria.test: Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
Some code in ma_blockrec.c assumes a trn even if !now_transactional but in
this case it only reads trn->rec_lsn, which has to be LSN_IMPOSSIBLE and
should be now. info->trn may be NULL in maria_chk.
*/
if (info->trn == NULL)
info->trn= &dummy_transaction_object;
DBUG_ASSERT(info->trn->rec_lsn == LSN_IMPOSSIBLE);
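The statements above, together with the commit note about keeping the real TRN through a disable_logging/reenable cycle, can be pictured with a toy model. This is only a sketch under assumptions: `Trn`, `Share`, `Handle`, `tmp_disable_logging()` and `reenable_logging()` are simplified, hypothetical stand-ins for the Maria structures and functions, and the log write, page-type switch and flush steps are left out.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { int id; } Trn;           /* hypothetical stand-in for TRN */
static Trn dummy_trn= { 0 };              /* plays dummy_transaction_object */

typedef struct {
  bool born_transactional;                /* table was created transactional */
  bool now_transactional;                 /* logging currently enabled */
} Share;

typedef struct {
  Share *share;
  Trn *trn;
  bool switched_transactional;
} Handle;

/* Turn logging off; give the handle a dummy TRN only if it has none. */
static void tmp_disable_logging(Handle *h)
{
  /* A real implementation would write its log record before this point. */
  h->share->now_transactional= false;
  if (h->trn == NULL)
    h->trn= &dummy_trn;
  h->switched_transactional= true;
}

/* Turn logging back on; a real TRN survives, the dummy one is dropped. */
static void reenable_logging(Handle *h)
{
  if (!h->switched_transactional)
    return;                               /* nothing was disabled */
  h->switched_transactional= false;
  h->share->now_transactional= h->share->born_transactional;
  if (h->trn == &dummy_trn)
    h->trn= NULL;
}
```

The design point illustrated is that the dummy object is substituted only when no transaction is attached, so a handle that entered the cycle with a real TRN leaves it with the same one.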
share->page_type= PAGECACHE_PLAIN_PAGE;
2007-12-30 21:32:07 +01:00
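The ma_recovery.c note in the commit message above says the hash key in all_dirty_pages is now built from index_or_data, the table's 2-byte short id and the page number, instead of a file descriptor and page number. A minimal sketch of packing such a key into 64 bits follows. The function name, the macro and the exact bit layout (40 bits for the page number, 16 for the short id, 1 flag bit) are assumptions made for illustration and are not taken from the Maria source.

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical layout, chosen for this sketch only:
    bit  56     : 1 if the page belongs to the index file, 0 for the data file
    bits 40..55 : 2-byte short id of the table
    bits 0..39  : page number
*/
#define SKETCH_PAGE_BITS 40

static uint64_t sketch_dirty_page_key(int is_index, uint16_t table_short_id,
                                      uint64_t pageno)
{
  /* The page number must fit in the bits reserved for it */
  assert(pageno < ((uint64_t) 1 << SKETCH_PAGE_BITS));
  return ((uint64_t) (is_index ? 1 : 0) << (SKETCH_PAGE_BITS + 16)) |
         ((uint64_t) table_short_id << SKETCH_PAGE_BITS) |
         pageno;
}
```

Under this assumed layout, two dirty pages with the same page number but different tables, or in the data file versus the index file of one table, get distinct keys, and no file descriptor is needed to tell them apart.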
/* Functions below will pick up now_transactional and change callbacks */
Disable logging of index pages during repair. Fixed failure in unittest/ma_test_loghandler_pagecache-t. Initialize pagecache callbacks explicitly, not with pagecache_init(). This is to make things more readable and for the future to make more choices with callbacks storage/maria/ha_maria.cc: Disable logging of index pages during repair storage/maria/ma_bitmap.c: Initialize callbacks explicitly, not with pagecache_init(), to make things more readable and for future to have more choices with callbacks Use new interface to flush logs from pagecache storage/maria/ma_check.c: Fixed test for wrong keyblocks Use default functions to setup callbacks for pagecache storage/maria/ma_loghandler.c: Use dummy functions for log flush callback (NULL doesn't work anymore) storage/maria/ma_open.c: Initialize callbacks explicitly, not with pagecache_init(), to make things more readable and for future to have more choices with callbacks Prefix external functions with _ma_ storage/maria/ma_pagecache.c: Use new simpler interface to flush logs if needed storage/maria/ma_pagecache.h: Changed interface to a faster, simpler one to flush logs. Now we have a function that takes care of flushing logs, instead of a function to get lsn address storage/maria/ma_pagecrc.c: Add functions for flushing logs storage/maria/ma_recovery.c: Rename functions storage/maria/maria_chk.c: Use default functions to setup callbacks for pagecache storage/maria/maria_def.h: Prefixed global functions with _ma_ storage/maria/unittest/ma_pagecache_consist.c: Use dummy functions for log flush callback (NULL doesn't work anymore) storage/maria/unittest/ma_pagecache_single.c: Use dummy functions for log flush callback (NULL doesn't work anymore) storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Use maria_flush_log_for_page to flush log pages. Fixes failure in unittest
2008-01-02 17:27:24 +01:00
_ma_set_data_pagecache_callbacks(&info->dfile, share);
_ma_set_index_pagecache_callbacks(&share->kfile, share);
_ma_bitmap_set_pagecache_callbacks(&share->bitmap.file, share);
WL#3072 Maria Recovery All statements doing an implicit commit now also do one in Maria. This is useful because LOCK TABLES; REPAIR; crash; is not rollback-able; the implicit commit of REPAIR keeps Recovery from trying to roll back and failing. Fix for BUG#33827 "COMMIT AND CHAIN causes serious Valgrind error" (maybe not the definite one, depends on the assigned dev). mysql-test/t/maria-recovery.test: test of REPAIR's implicit commit. I cannot commit the result file because maria-recovery fails in vanilla tree (seen in pushbuild) but its new section looks like: repair table t1; Table Op Msg_type Msg_text mysqltest.t1 repair status OK insert into t1 values(2); select * from t1; a 1 2 3 SET SESSION debug="+d,maria_flush_whole_log,maria_flush_whole_page_cache,maria_crash"; * crashing mysqld intentionally set global maria_checkpoint_interval=1; ERROR HY000: Lost connection to MySQL server during query * recovery happens check table t1 extended; Table Op Msg_type Msg_text mysqltest.t1 check status OK * testing that checksum after recovery is as expected Checksum-check failure use mysqltest; select * from t1; a 1 3 Which is as it should be. sql/rpl_injector.cc: fix for BUG#33827 sql/sql_parse.cc: - All DDLs and mysql_admin_table() (REPAIR etc) use end_active_trans() to do an implicit commit so we add there an implicit commit of the Maria transaction. - Fix for BUG#33827 storage/maria/ha_maria.cc: - A method to do implicit commit in Maria - After an implicit commit, if it was under LOCK TABLES, the locked tables have a stale file->trn: update it. storage/maria/ha_maria.h: new static method storage/maria/ma_check.c: bugfix: this disabling of transactionality had the effect that if LOCK TABLES; REPAIR; INSERT then the INSERT ran non-transactional (so couldn't be undone in case of crash, if, by bad chance, its effect on pages went to disk).
storage/maria/ma_checkpoint.c: indentation storage/maria/ma_recovery.c: dbug statements storage/maria/trnman.c: When doing an implicit commit we need to know the number of locked tables of the committed transaction and copy it to the new transaction storage/maria/trnman_public.h: prototype change
2008-01-11 22:48:54 +01:00
DBUG_VOID_RETURN;
}
/**
Re-enables logging for a table which had it temporarily disabled.
Only the thread which disabled logging is allowed to re-enable it. Indeed,
re-enabling logging affects all open instances, so one must have exclusive
access to the table to do that. In practice, the thread which disables
logging has such access.
Fix for BUG#41159 "Maria: deadlock between checkpoint and maria_write() when extending data file". No testcase (concurrency, tested by pushbuild2). storage/maria/ha_maria.cc: a comment about what Sanja had discovered a while ago storage/maria/ma_bitmap.c: guard against concurrent flush of bitmap by checkpoint: we must have close_lock here storage/maria/ma_blockrec.c: comment fixed for new behaviour storage/maria/ma_checkpoint.c: Release intern_lock before flushing bitmap, or it deadlocks with allocate_and_write_block_record() when that function needs to increase the data file's length (that function makes bitmap non flushable, then wants intern_lock to increase data_file_length). The checkpoint section which looks at the share's content (bitmap, state) needs to be protected from the possible my_free-ing done by a concurrent maria_close(); intern_lock is not enough as both maria_close() and checkpoint now have to release it in the middle. So the protection is done with close_lock. in_checkpoint is now protected by close_lock in places where it was protected by intern_lock. storage/maria/ma_close.c: hold close_lock in maria_close() from start to end, to guard against checkpoint trying to flush bitmap while we have my_free'd its structures, for example. intern_lock was not enough as both maria_close() and checkpoint have to release it in the middle, to avoid deadlocks. storage/maria/ma_open.c: initialize new mutex storage/maria/ma_recovery.c: a comment about what Sanja had discovered a while ago storage/maria/maria_def.h: comment. new mutex protecting the close of a MARIA_SHARE, from _start_ to _end_ of it.
2008-12-09 10:56:02 +01:00
@param info table
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle) - fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc) - when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix) - UNDO_BULK_INSERT_WITH_REPAIR is also used with repair, changes name mysql-test/r/maria.result: tests for bugs fixed mysql-test/t/maria.test: tests for bugs fixed sql/sql_insert.cc: Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging. storage/maria/ha_maria.cc: Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()). storage/maria/ha_maria.h: variable now can have 3 states not 2 storage/maria/ma_bitmap.c: name change storage/maria/ma_blockrec.c: name change storage/maria/ma_blockrec.h: name change storage/maria/ma_check.c: * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before. * the log record of bulk insert can be used even without repair * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush). storage/maria/ma_key_recover.c: name change storage/maria/ma_loghandler.c: name change storage/maria/ma_loghandler.h: name change storage/maria/ma_pagecache.c: A function, to check in debug builds that no dirty pages exist for a file. 
storage/maria/ma_pagecache.h: new function (nothing in non-debug) storage/maria/ma_recovery.c: _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist). storage/maria/maria_chk.c: comment storage/maria/maria_def.h: new names mysql-test/suite/rpl/r/rpl_stm_maria.result: result (it used to crash) mysql-test/suite/rpl/t/rpl_stm_maria.test: Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
@param flush_pages if function needs to flush pages first
2007-12-30 21:32:07 +01:00
*/
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle) - fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc) - when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix) - UNDO_BULK_INSERT_WITH_REPAIR is also used with repair, changes name mysql-test/r/maria.result: tests for bugs fixed mysql-test/t/maria.test: tests for bugs fixed sql/sql_insert.cc: Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging. storage/maria/ha_maria.cc: Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()). storage/maria/ha_maria.h: variable now can have 3 states not 2 storage/maria/ma_bitmap.c: name change storage/maria/ma_blockrec.c: name change storage/maria/ma_blockrec.h: name change storage/maria/ma_check.c: * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before. * the log record of bulk insert can be used even without repair * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush). storage/maria/ma_key_recover.c: name change storage/maria/ma_loghandler.c: name change storage/maria/ma_loghandler.h: name change storage/maria/ma_pagecache.c: A function, to check in debug builds that no dirty pages exist for a file. 
storage/maria/ma_pagecache.h: new function (nothing in non-debug) storage/maria/ma_recovery.c: _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist). storage/maria/maria_chk.c: comment storage/maria/maria_def.h: new names mysql-test/suite/rpl/r/rpl_stm_maria.result: result (it used to crash) mysql-test/suite/rpl/t/rpl_stm_maria.test: Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
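The annotation above describes keeping a table's real TRN through a disable/reenable cycle and, on reenable, either flushing dirty pages or verifying that none exist. A minimal sketch of that idea follows. It assumes hypothetical stand-in types (`toy_trn`, `toy_share`, `toy_handle`) and helpers (`toy_disable_logging`, `toy_reenable_logging`); none of these are the real Maria structures or functions, and where the real code asserts in debug builds the sketch returns an error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the real Maria types. */
typedef struct { int id; } toy_trn;

typedef struct {
  bool born_transactional;  /* how the table was created */
  bool now_transactional;   /* whether logging is currently enabled */
  int dirty_pages;          /* pages changed while logging was off */
} toy_share;

typedef struct {
  toy_share *s;
  toy_trn *trn;                 /* transaction currently attached */
  toy_trn *saved_trn;           /* real transaction kept during the cycle */
  bool switched_transactional;  /* set only by toy_disable_logging() */
} toy_handle;

/* Placeholder transaction used while logging is disabled. */
static toy_trn toy_dummy_trn = { -1 };

/* Turn logging off, remembering the real transaction. */
static void toy_disable_logging(toy_handle *h)
{
  assert(h != NULL && h->s != NULL);
  if (!h->s->now_transactional)
    return;                     /* already off: nothing to switch */
  h->s->now_transactional = false;
  h->saved_trn = h->trn;
  h->trn = &toy_dummy_trn;
  h->switched_transactional = true;
}

/* Turn logging back on. Returns 0 on success (including "nothing to do"),
   1 if the caller declined the flush although dirty pages exist. */
static int toy_reenable_logging(toy_handle *h, bool flush_pages)
{
  toy_share *s;
  assert(h != NULL && h->s != NULL);
  s = h->s;
  /* Same shape as the guard in the listing: skip if state is unchanged. */
  if (s->now_transactional == s->born_transactional ||
      !h->switched_transactional)
    return 0;
  h->switched_transactional = false;
  if (flush_pages)
    s->dirty_pages = 0;         /* stands in for a flush and sync */
  else if (s->dirty_pages != 0)
    return 1;                   /* simplification of the debug-build check */
  s->now_transactional = s->born_transactional;
  h->trn = h->saved_trn;        /* the original TRN survives the cycle */
  h->saved_trn = NULL;
  return 0;
}
```

In this sketch, calling `toy_reenable_logging()` on a handle that never went through `toy_disable_logging()` is a no-op, which mirrors the early `DBUG_RETURN(0)` in the listing.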
my_bool _ma_reenable_logging_for_table(MARIA_HA *info, my_bool flush_pages)
{
MARIA_SHARE *share= info->s;
WL#3072 Maria Recovery All statements doing an implicit commit now also do one in Maria. This is useful because LOCK TABLES; REPAIR; crash; is not rollback-able, the implicit commit of REPAIR avoid that Recovery tries to rollback and fails. Fix for BUG#33827 "COMMIT AND CHAIN causes serious Valgrind error" (maybe not the definite one, depends on the assigned dev). mysql-test/t/maria-recovery.test: test of REPAIR's implicit commit. I cannot commit the result file because maria-recovery fails in vanilla tree (seen in pushbuild) but its new section looks like: repair table t1; Table Op Msg_type Msg_text mysqltest.t1 repair status OK insert into t1 values(2); select * from t1; a 1 2 3 SET SESSION debug="+d,maria_flush_whole_log,maria_flush_whole_page_cache,maria_crash"; * crashing mysqld intentionally set global maria_checkpoint_interval=1; ERROR HY000: Lost connection to MySQL server during query * recovery happens check table t1 extended; Table Op Msg_type Msg_text mysqltest.t1 check status OK * testing that checksum after recovery is as expected Checksum-check failure use mysqltest; select * from t1; a 1 3 Which is as it should be. sql/rpl_injector.cc: fix for BUG#33827 sql/sql_parse.cc: - All DDLs and mysql_admin_table() (REPAIR etc) use end_actrive_trans() to do an implicit commit so we add there an implicit commit of the Maria transaction. - Fix for BUG#33827 storage/maria/ha_maria.cc: - A method to do implicit commit in Maria - After an implicit commit, if it was under LOCK TABLES, the locked tables have a stale file->trn: update it. storage/maria/ha_maria.h: new static method storage/maria/ma_check.c: bugfix: this disabling of transactionality had the effect that if LOCK TABLES; REPAIR; INSERT then the INSERT ran non-transactional (so couldn't be undone in case of crash, if, by bad chance, its effect on pages went to disk). 
storage/maria/ma_checkpoint.c: indentation storage/maria/ma_recovery.c: dbug statements storage/maria/trnman.c: When doing an implicit commit we need to know the number of locked tables of the committed transaction and copy it to the new transaction storage/maria/trnman_public.h: prototype change
2008-01-11 22:48:54 +01:00
DBUG_ENTER("_ma_reenable_logging_for_table");
Bugs fixed: - If not in autocommit mode, delete rows one by one so that we can roll back if necessary - bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten - Fixed bug in bitmap handling when allocation tail pages - Ensure we reserve place for directory entry when calculation place for head and tail pages - Fixed wrong value in bitmap->size[0] - Fixed wrong assert in flush_log_for_bitmap - Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset - Mark new pages as changed (Required to get repair() to work) - Fixed problem with advancing log horizon pointer within one page bounds - Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() - Fixed bug in logging of rows with more than one big blob - Fixed DBUG_ASSERTS() in pagecache to allow change of WRITE_LOCK to READ_LOCK in unlock() calls - Flush pagecache when we change from logging to not logging (if not, pagecache code breaks) - Ensure my_errno is set on return from write/delete/update - Fixed bug when using FIELD_SKIP_PRESPACE New features: - mysql_fix_privilege_tables now first uses binaries and scripts from source distribution, then in installed distribution - Fix that optimize works for Maria tables - maria_check --zerofill now also clear freed blob pages - maria_check -di now prints more information about record page utilization Optimizations: - Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) - Simplify code to abort when we found optimal bit pattern - Skip also full head page bit patterns when searching for tail - Increase default repair buffer to 128M for maria_chk and maria_read_log - Increase default sort buffer for maria_chk to 64M - Increase size of sortbuffer and pagecache for mysqld to 64M - VARCHAR/CHAR fields are stored in increasing length order for BLOCK_RECORD tables Better reporting: - Fixed test of error condition for flush (for better error code) - More error messages to mysqld if Maria recovery fails - Always print warning if rows are deleted in repair - Added global function _db_force_flush() that is usable when doing debugging in gdb - Added call to my_debug_put_break_here() in case of some errors (for debugging) - Remove used testfiles in unittest as these was written in different directories depending on from where the test was started This should fix the bugs found when importing a big table with many varchars and one/many blobs to Maria dbug/dbug.c: Added global function _db_force_flush() that is usable when doing debugging in gdbine extra/replace.c: Fixed memory leak include/my_dbug.h: Prototype for _db_force_flush() include/my_global.h: Added stdarg.h as my_sys.h now depends on it. 
include/my_sys.h: Make my_dbug_put_break_here() a NOP if not DBUG build Added my_printv_error() include/myisamchk.h: Added entry 'lost' to be able to count space that is lost forever mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Reset autocommit after test New test to check if delete_all_rows is used (verified with --debug) mysys/my_error.c: Added my_printv_error() scripts/mysql_fix_privilege_tables.sh: First use binaries and scripts from source distribution, then in installed distribution This ensures that a development branch doesn't pick up wrong scripts) sql/mysqld.cc: Fix that one can break maria recovery with ^C when debugging sql/sql_class.cc: Removed #ifdef that has no effect (The preceeding DBUG_ASSERT() ensures that the following code will not be exectued) storage/maria/ha_maria.cc: Increase size of sortbuffer and pagecache to 64M Fix that optimize works for Maria tables Fixed DBUG_ASSERT() when enable_indexes failes for end_bulk_insert() If not in autocommit mode, delete rows one by one so that we can roll back if necessary Fixed variable comments storage/maria/ma_bitmap.c: More ASSERTS to detect overwrite of bitmap pages bitmap->used_size was not correctly set, which caused bitmap pages to be overwritten Ensure we reserve place for directory entry when calculation place for head and tail pages bitmap->size[0] should not include space for directory entry Simplify code to abort when we found optimal bit pattern Skip also full head page bit patterns when searching for tail (should speed up some common cases) Fixed bug in allocate_tail() when block->used was not aligned on 6 bytes Fixed wrong assert in flush_log_for_bitmap Fixed bug in _ma_bitmap_release_unused() where tail blocks could be wrongly reset storage/maria/ma_blockrec.c: Ensure my_errno is set on return Fixed not optimal setting of row->min_length if we don't have variable length fields Use pagecache_unlock_by_link() instead of pagecache_write() if possible. 
(Avoids a memory copy and a find_block) Added DBUG_ASSERT() if we read or write wrong VARCHAR data Added DBUG_ASSERT() to find out if row sizes are calculated wrong Fixed bug in logging of rows with more than one big blob storage/maria/ma_check.c: Disable logging while normal repair is done to avoid logging of index changes Fixed bug that caused CHECKSUM part of key page to be used Fixed that deleted of wrong records also works for BLOCK_RECORD Clear unallocated pages: - BLOB pages are not automaticly cleared on delete, so we need to use the bitmap to know if page is used or not Better error reporting More information about record page utilization Change printing of file position to printing of pages to make output more readable Always print warning if rows are deleted storage/maria/ma_create.c: Calculate share.base_max_pack_length more accurately for BLOCK_RECORD pages (for future) Fixed that FIELD_SKIP_PRESPACE is recorded as FIELD_NORMAL; Fixed bug where fields could be used in wrong order Store FIELD_SKIP_ZERO fields before CHAR and VARCHAR fields (optimization) Store other fields in length order (to get better utilization of head block) storage/maria/ma_delete.c: Ensure my_errno is set on return storage/maria/ma_dynrec.c: Indentation fix storage/maria/ma_locking.c: Set changed if open_count is counted down. 
(To avoid getting error "client is using or hasn't closed the table properly" with transactional tables storage/maria/ma_loghandler.c: Fixed problem with advancing log horizon pointer within one page bounds (Patch from Sanja) Added more DBUG Indentation fixes storage/maria/ma_open.c: Removed wrong casts storage/maria/ma_page.c: Fixed usage of PAGECACHE_LOCK_WRITE_UNLOCK with _ma_new() Mark new pages as changed (Required to get repair() to work) storage/maria/ma_pagecache.c: Fixed test of error condition for flush Fixed problem when using PAGECACHE_LOCK_WRITE_TO_READ with unlock() Added call to my_debug_put_break_here() in case of errors (for debugging) storage/maria/ma_pagecrc.c: Ensure we get same crc for 32 and 64 bit systems by forcing argument to maria_page_crc to uint32 storage/maria/ma_recovery.c: Call my_printv_error() from eprint() to get critical errors to mysqld log Removed \n from error strings to eprint() to get nicer output in mysqld Added simple test in _ma_reenable_logging_for_table() to not do any work if not needed storage/maria/ma_update.c: Ensure my_errno is set on return storage/maria/ma_write.c: Ensure my_errno is set on return storage/maria/maria_chk.c: Use DEBUGGER_OFF if --debug is not use (to get slightly faster execution for debug binaries) Added option --skip-safemalloc Don't write exponents for rec/key storage/maria/maria_def.h: Increase default repair buffer to 128M for maria_chk and maria_read_log Increase default sort buffer for maria_chk to 64M storage/maria/unittest/Makefile.am: Don't update files automaticly from bitkeeper storage/maria/unittest/ma_pagecache_consist.c: Remove testfile at end storage/maria/unittest/ma_pagecache_single.c: Remove testfile at end storage/maria/unittest/ma_test_all-t: More tests Safer checking if test caused error
2008-01-07 17:54:41 +01:00
if (share->now_transactional == share->base.born_transactional ||
!info->switched_transactional)
DBUG_RETURN(0);
info->switched_transactional= FALSE;
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
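The dirty_pages bugfix described in the ma_recovery.c note above comes down to which LSN limit is chosen: min(checkpoint_start_log_horizon, min(trn's rec_lsn)) instead of the minimum rec_lsn of the dirty pages themselves. A minimal sketch of that computation follows, under stated assumptions: sketch_lsn and dirty_pages_lsn_limit() are hypothetical names introduced only for illustration, LSNs are modelled as plain 64-bit integers, and none of this is the actual Maria code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN; not the real Maria type. */
typedef uint64_t sketch_lsn;

/*
  Return min(checkpoint_start_log_horizon, min(trn_rec_lsn[0..n-1])).
  With no transactions (n == 0) the checkpoint horizon alone is the limit.
  The name and signature are assumptions made for this sketch.
*/
static sketch_lsn dirty_pages_lsn_limit(sketch_lsn checkpoint_start_log_horizon,
                                        const sketch_lsn *trn_rec_lsn,
                                        size_t n)
{
  sketch_lsn limit= checkpoint_start_log_horizon;
  size_t i;
  for (i= 0; i < n; i++)
  {
    if (trn_rec_lsn[i] < limit)
      limit= trn_rec_lsn[i];
  }
  return limit;
}
```

The point of the sketch is only that the limit no longer depends on the dirty pages' own rec_lsn values, which is what made the list useless before the fix.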
2007-12-30 21:32:07 +01:00
if ((share->now_transactional= share->base.born_transactional))
{
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle) - fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc) - when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix) - UNDO_BULK_INSERT_WITH_REPAIR is also used with repair, changes name mysql-test/r/maria.result: tests for bugs fixed mysql-test/t/maria.test: tests for bugs fixed sql/sql_insert.cc: Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging. storage/maria/ha_maria.cc: Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()). storage/maria/ha_maria.h: variable now can have 3 states not 2 storage/maria/ma_bitmap.c: name change storage/maria/ma_blockrec.c: name change storage/maria/ma_blockrec.h: name change storage/maria/ma_check.c: * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before. * the log record of bulk insert can be used even without repair * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush). storage/maria/ma_key_recover.c: name change storage/maria/ma_loghandler.c: name change storage/maria/ma_loghandler.h: name change storage/maria/ma_pagecache.c: A function, to check in debug builds that no dirty pages exist for a file. 
storage/maria/ma_pagecache.h: new function (nothing in non-debug) storage/maria/ma_recovery.c: _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist). storage/maria/maria_chk.c: comment storage/maria/maria_def.h: new names mysql-test/suite/rpl/r/rpl_stm_maria.result: result (it used to crash) mysql-test/suite/rpl/t/rpl_stm_maria.test: Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
share->page_type= PAGECACHE_LSN_PAGE;
WL#3138: Maria - fast "SELECT COUNT(*) FROM t;" and "CHECKSUM TABLE t" Added argument to maria_end_bulk_insert() to know if the table will be deleted after the operation Fixed wrong call to strmake Don't call bulk insert in case of inserting only one row (speed optimization as starting/stopping bulk insert Allow storing year 2155 in year field When running with purify/valgrind avoid copying structures over themself Added hook 'trnnam_end_trans_hook' that is called when transaction ends Added trn->used_tables that is used to an entry for all tables used by transaction Fixed that ndb doesn't crash on duplicate key error when start_bulk_insert/end_bulk_insert are not called include/maria.h: Added argument to maria_end_bulk_insert() to know if the table will be deleted after the operation include/my_tree.h: Added macro 'reset_free_element()' to be able to ignore calls to the external free function. Is used to optimize end-bulk-insert in case of failures, in which case we don't want write the remaining keys in the tree mysql-test/install_test_db.sh: Upgrade to new mysql_install_db options mysql-test/r/maria-mvcc.result: New tests mysql-test/r/maria.result: New tests mysql-test/suite/ndb/r/ndb_auto_increment.result: Fixed error message now when bulk insert is not always called mysql-test/suite/ndb/t/ndb_auto_increment.test: Fixed error message now when bulk insert is not always called mysql-test/t/maria-mvcc.test: Added testing of versioning of count(*) mysql-test/t/maria-page-checksum.test: Added comment mysql-test/t/maria.test: More tests mysys/hash.c: Code style change sql/field.cc: Allow storing year 2155 in year field sql/ha_ndbcluster.cc: Added new argument to end_bulk_insert() to signal if the bulk insert should ignored sql/ha_ndbcluster.h: Added new argument to end_bulk_insert() to signal if the bulk insert should ignored sql/ha_partition.cc: Added new argument to end_bulk_insert() to signal if the bulk insert should ignored sql/ha_partition.h: Added new argument 
to end_bulk_insert() to signal if the bulk insert should ignored sql/handler.cc: Don't call get_dup_key() if there is no table object. This can happen if the handler generates a duplicate key error on commit sql/handler.h: Added new argument to end_bulk_insert() to signal if the bulk insert should ignored (ie, the table will be deleted) sql/item.cc: Style fix Removed compiler warning sql/log_event.cc: Added new argument to ha_end_bulk_insert() sql/log_event_old.cc: Added new argument to ha_end_bulk_insert() sql/mysqld.cc: Removed compiler warning sql/protocol.cc: Added DBUG sql/sql_class.cc: Added DBUG Fixed wrong call to strmake sql/sql_insert.cc: Don't call bulk insert in case of inserting only one row (speed optimization as starting/stopping bulk insert involves a lot of if's) Added new argument to ha_end_bulk_insert() sql/sql_load.cc: Added new argument to ha_end_bulk_insert() sql/sql_parse.cc: Style fixes Avoid goto in common senario sql/sql_select.cc: When running with purify/valgrind avoid copying structures over themself. This is not a real bug in itself, but it's a waste of cycles and causes valgrind warnings sql/sql_select.h: Avoid copying structures over themself. This is not a real bug in itself, but it's a waste of cycles and causes valgrind warnings sql/sql_table.cc: Call HA_EXTRA_PREPARE_FOR_DROP if table created by ALTER TABLE is going to be dropped Added new argument to ha_end_bulk_insert() storage/archive/ha_archive.cc: Added new argument to end_bulk_insert() storage/archive/ha_archive.h: Added new argument to end_bulk_insert() storage/federated/ha_federated.cc: Added new argument to end_bulk_insert() storage/federated/ha_federated.h: Added new argument to end_bulk_insert() storage/maria/Makefile.am: Added ma_state.c and ma_state.h storage/maria/ha_maria.cc: Versioning of count(*) and checksum - share->state.state is now assumed to be correct, not handler->state - Call _ma_setup_live_state() in external lock to get count(*)/checksum versioning. 
In case of not versioned and not concurrent insertable table, file->s->state.state contains the correct state information Other things: - file->s -> share - Added DBUG_ASSERT() for unlikely case - Optimized end_bulk_insert() to not write anything if table is going to be deleted (as in failed alter table) - Indentation changes in external_lock because of removed 'goto' caused a big conflict even if very little was changed storage/maria/ha_maria.h: New argument to end_bulk_insert() storage/maria/ma_blockrec.c: Update for versioning of count(*) and checksum Keep share->state.state.data_file_length up to date (not info->state->data_file_length) Moved _ma_block_xxxx_status() and maria_versioning() functions to ma_state.c storage/maria/ma_check.c: Update and use share->state.state instead of info->state info->s to share Update info->state at end of repair Call _ma_reset_state() to update share->state_history at end of repair storage/maria/ma_checkpoint.c: Call _ma_remove_not_visible_states() on checkpoint to clean up not visible state history from tables storage/maria/ma_close.c: Remember state history for running transaction even if table is closed storage/maria/ma_commit.c: Ensure we always call trnman_commit_trn() even if other calls fail. If we don't do that, the translog and state structures will not be freed storage/maria/ma_delete.c: Versioning of count(*) and checksum: - Always update info->state->checksum and info->state->records storage/maria/ma_delete_all.c: Versioning of count(*) and checksum: - Ensure that share->state.state is updated, as this is where we store the primary information storage/maria/ma_dynrec.c: Use lock_key_trees instead of concurrent_insert to check if trees should be locked. This allows us to lock trees both for concurrent_insert and for index versioning. 
storage/maria/ma_extra.c: Versioning of count(*) and checksum: - Use share->state.state instead of info->state - share->concurrent_insert -> share->non_transactional_concurrent_insert - Don't update share->state.state from info->state if transactional table Optimization: - Don't flush io_cache or bitmap if we are using FLUSH_IGNORE_CHANGED storage/maria/ma_info.c: Get most state information from current state storage/maria/ma_init.c: Add hash table and free function to store states for closed tables Install hook for transaction commit/rollback to update history state storage/maria/ma_key_recover.c: Versioning of count(*) and checksum: - Use share->state.state instead of info->state storage/maria/ma_locking.c: Versioning of count(*) and checksum: - Call virtual functions (if exists) to restore/update status - Move _ma_xxx_status() functions to ma_state.c info->s -> share storage/maria/ma_open.c: Versioning of count(*) and checksum: - For not transactional tables, set info->state to point to new allocated state structure. - Initialize new info->state_start variable that points to state at start of transaction - Copy old history states from hash table (maria_stored_states) first time the table is opened - Split flag share->concurrent_insert to non_transactional_concurrent_insert & lock_key_tree - For now, only enable versioning of tables without keys (to be fixed in soon!) 
- Added new virtual function to restore status in maria_lock_database) More DBUG storage/maria/ma_page.c: Versioning of count(*) and checksum: - Use share->state.state instead of info->state - Modify share->state.state.key_file_length under share->intern_lock storage/maria/ma_range.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees info->s -> share storage/maria/ma_recovery.c: Versioning of count(*) and checksum: - Use share->state.state instead of info->state - Update state information on close and when reenabling logging storage/maria/ma_rkey.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees storage/maria/ma_rnext.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees storage/maria/ma_rnext_same.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees - Only skip rows based on file length if non_transactional_concurrent_insert is set storage/maria/ma_rprev.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees storage/maria/ma_rsame.c: Versioning of count(*) and checksum: - Lock trees based on share->lock_key_trees storage/maria/ma_sort.c: Use share->state.state instead of info->state Fixed indentation storage/maria/ma_static.c: Added maria_stored_state storage/maria/ma_update.c: Versioning of count(*) and checksum: - Always update info->state->checksum and info->state->records - Remove optimization for index file update as it doesn't work for transactional tables storage/maria/ma_write.c: Versioning of count(*) and checksum: - Always update info->state->checksum and info->state->records storage/maria/maria_def.h: Move MARIA_STATUS_INFO to ma_state.h Changes to MARIA_SHARE: - Added state_history to store count(*)/checksum states - Added in_trans as counter if table is used by running transactions - Split concurrent_insert into lock_key_trees and on_transactional_concurrent_insert. 
- Added virtual function lock_restore_status Changes to MARIA_HA: - save_state -> state_save - Added state_start to store state at start of transaction storage/maria/maria_pack.c: Versioning of count(*) and checksum: - Use share->state.state instead of info->state Indentation fixes storage/maria/trnman.c: Added hook 'trnnam_end_trans_hook' that is called when transaction ends Added trn->used_tables that holds an entry for every table used by the transaction More DBUG Changed return type of trnman_end_trn() to my_bool Added trnman_get_min_trid() to get minimum trid in use. Added trnman_exists_active_transactions() to check if there exists a running transaction started between two commit ids storage/maria/trnman.h: Added 'used_tables' Moved all pointers into same groups to get better memory alignment storage/maria/trnman_public.h: Added prototypes for new functions and variables Changed return type of trnman_end_trn() to my_bool storage/myisam/ha_myisam.cc: Added argument to end_bulk_insert() if operation should be aborted storage/myisam/ha_myisam.h: Added argument to end_bulk_insert() if operation should be aborted storage/maria/ma_state.c: Functions to handle state of count(*) and checksum storage/maria/ma_state.h: Structures and declarations to handle state of count(*) and checksum
2008-05-29 17:33:33 +02:00
/*
Copy state information that was updated while the table was used
in non-transactional mode
*/
_ma_copy_nontrans_state_information(info);
_ma_reset_history(info->s);
- fix for segfault in rpl_trigger/rpl_found_rows with default engine=maria (fix is keeping the real TRN through a disable_logging/reenable cycle) - fix for pagecache assertion failure in ps/type_ranges with default engine=maria (fix is in sql_insert.cc) - when reenabling logging we must either flush all dirty pages, or at least verify (in debug build) that there are none. For example a bulk insert with single UNDO_BULK_INSERT must flush them, no matter if it uses repair or not (bugfix) - UNDO_BULK_INSERT_WITH_REPAIR is also used with repair, changes name mysql-test/r/maria.result: tests for bugs fixed mysql-test/t/maria.test: tests for bugs fixed sql/sql_insert.cc: Bugfix: even if select_create::prepare() failed to create the 'table' object we still have to re-enable logging. storage/maria/ha_maria.cc: Bugfix: when a transactional table does a bulk insert without repair, it still sometimes skips logging of REDOs thus needs a full flush and sync at the end. Not if repair is done, as repair does it internally already (see end of maria_repair*()). storage/maria/ha_maria.h: variable now can have 3 states not 2 storage/maria/ma_bitmap.c: name change storage/maria/ma_blockrec.c: name change storage/maria/ma_blockrec.h: name change storage/maria/ma_check.c: * When maria_repair() re-enables logging it does not need to ask for a flush&sync as it did it by itself already a few lines before. * the log record of bulk insert can be used even without repair * disable logging in maria_zerofill(): without that, it puts LSN pages in the cache, so when it flushes them it flushes the log; the change makes auto-ha_maria::zerofill-if-moved faster (no log flush). storage/maria/ma_key_recover.c: name change storage/maria/ma_loghandler.c: name change storage/maria/ma_loghandler.h: name change storage/maria/ma_pagecache.c: A function, to check in debug builds that no dirty pages exist for a file. 
storage/maria/ma_pagecache.h: new function (nothing in non-debug) storage/maria/ma_recovery.c: _ma_tmp_disable_logging() sets info->trn to dummy_transaction_object when needed now. The changes done here about info->trn are to allow a table to retain its original, real TRN through a disable/reenable cycle (see replication scenario in _ma_reenable_logging_for_table()). When we reenable, we offer the caller to flush and sync the table; if the caller doesn't accept our offer, we verify that it's ok (no REDOs => no dirty pages are allowed to exist). storage/maria/maria_chk.c: comment storage/maria/maria_def.h: new names mysql-test/suite/rpl/r/rpl_stm_maria.result: result (it used to crash) mysql-test/suite/rpl/t/rpl_stm_maria.test: Test of replication-specific Maria bug fixed
2008-01-20 05:25:26 +01:00
if (flush_pages)
{
/*
We are going to change callbacks; if a page is flushed at this moment
this can cause race conditions, that's one reason to flush pages
Fix for BUG#39363 "Concurent inserts in the same table lead to hang in maria engine" (need a mutex when modifying bitmap->non_flushable), which I hit when running maria_bulk_insert.yy. After fixing this, I hit an assertion in check_and_set_lsn() saying that the page was PAGECACHE_PLAIN_PAGE. This could be caused by pages left by an operation which had transactions disabled (like a bulk insert with repair): in this patch we remove those pages out of the cache when we re-enable transactions. After fixing this, I get page cache deadlocks, pushbuild2 also has some, to be looked at. No testcase, requires concurrency and running for 15 minutes, but automatically tested by pushbuild2. storage/maria/ma_bitmap.c: Doing bitmap->non_flushable++ without mutex was wrong. If this ++ happened while another ++ or -- was happening in another thread, one ++ or -- could be missed and the bitmap code would behave wrongly. For example, if a ++ was missed, the DBUG_ASSERT(((int) (bitmap->non_flushable)) >= 0) in _ma_bitmap_release_unused() could fire. I saw this assertion happen in practice in maria_bulk_insert.yy. Adding this mutex lock eliminated the assertion problem. The >=0 was wrong, should be >0 (or the variable could go negative). storage/maria/ma_recovery.c: When we re-enable transactionality, as we may have created pages of type PAGECACHE_PLAIN_PAGE before, we need to remove them from the cache (FLUSH_RELEASE). Or they would stay this way, and later when we maria_write() to them, we would try to tag them with a LSN (ma_unpin_all_pages()), which is incorrect for a plain page (and causes assertion in the page cache at start of check_and_set_lsn()). I saw the assertion fire with maria_bulk_insert.yy, and this seems to cure it. page cache
2008-10-17 15:37:07 +02:00
now. Other reasons: a checkpoint could be running and miss pages; the
pages have type PAGECACHE_PLAIN_PAGE which should not remain. As
there are no REDOs for these pages, they, the bitmaps and the state also have to
be flushed and synced.
*/
if (_ma_flush_table_files(info, MARIA_FLUSH_DATA | MARIA_FLUSH_INDEX,
FLUSH_RELEASE, FLUSH_RELEASE) ||
_ma_state_info_write(share,
MA_STATE_INFO_WRITE_DONT_MOVE_OFFSET |
MA_STATE_INFO_WRITE_LOCK) ||
_ma_sync_table_files(info))
DBUG_RETURN(1);
}
else if (!maria_in_recovery)
{
/*
Except in Recovery, we mustn't leave dirty pages (see comments above).
Note that this does not verify that the state was flushed, but hey.
*/
pagecache_file_no_dirty_page(share->pagecache, &info->dfile);
pagecache_file_no_dirty_page(share->pagecache, &share->kfile);
}
_ma_set_data_pagecache_callbacks(&info->dfile, share);
_ma_set_index_pagecache_callbacks(&share->kfile, share);
_ma_bitmap_set_pagecache_callbacks(&share->bitmap.file, share);
WL#3072 - Maria Recovery * to honour WAL we now force the whole log when flushing a bitmap page. * ability to intentionally crash in various places for recovery testing * bugfix (dirty pages list found in checkpoint record was ignored) * smaller checkpoint record * misc small cleanups and comments mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length mysql-test/r/maria-recovery.result: result update mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)". mysql-test/t/maria-recovery.test: test of checkpoint sql/sql_table.cc: comment storage/maria/ha_maria.cc: _ma_reenable_logging_for_table() now includes file->trn=0. At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition. Ability to crash at the end of bulk insert when indices have been enabled. storage/maria/ma_bitmap.c: Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it. _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL. 
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c. storage/maria/ma_blockrec.h: functions which need to be exported storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already. storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1. storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table()) storage/maria/ma_extra.c: Monty fixed storage/maria/ma_key_recover.c: comment storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional. storage/maria/ma_loghandler.c: update to new prototype storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality. storage/maria/ma_pagecache.c: Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant). When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed. As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1. 
storage/maria/ma_pagecache.h: get_log_address callback storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway. storage/maria/ma_recovery.c: Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno. If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table). If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case). Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)). When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore). storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written. storage/maria/maria_chk.c: remove use of duplicate functions. storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function. storage/maria/unittest/ma_pagecache_consist.c: new prototype storage/maria/unittest/ma_pagecache_single.c: new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
/*
info->trn was not changed in the disable/enable combo, so that it's
still usable in this kind of combination:
  external_lock;
  start_bulk_insert; # table is empty, disables logging
  end_bulk_insert;   # enables logging
  start_bulk_insert; # table is not empty, logging stays enabled,
                     # so row insertion needs the real trn.
as happens during row-based replication on the slave.
*/
}
DBUG_RETURN(0);
WL#3072 - Maria recovery
maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem (=ALTER TABLE or CREATE SELECT, which both disable logging of REDO_INSERT*). For that, when ha_maria::external_lock() disables transactionality it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a" picks up to write a warning. REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log executes LOGREC_REDO_REPAIR no warning is needed.
storage/maria/ha_maria.cc: as we now log a record when disabling transactionality, we need the TRN to be set up first
storage/maria/ma_blockrec.c: comment
storage/maria/ma_loghandler.c: new type of log record
storage/maria/ma_loghandler.h: new type of log record
storage/maria/ma_recovery.c:
  * maria_apply_log() now returns a count of warnings. What currently produces warnings is:
    - skipping applying UNDOs though there are some (=> inconsistent table)
    - replaying log (in maria_read_log) though the log contains some ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those and is thus incomplete).
    Count of warnings affects the final message of maria_read_log and recovery (though in recovery neither of the two conditions above should happen).
  * maria_read_log used to always print a warning message at startup to say it is unsafe if ALTER TABLE was used. Now it prints it only if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT was used (both disable logging of REDO_INSERT* as those records are not needed for recovery; those missing records in turn make recreation-from-scratch, via maria_read_log, impossible). For that, when ha_maria::external_lock() disables transactionality, _ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h: maria_apply_log() returns a count of warnings
storage/maria/maria_def.h: _ma_tmp_disable_logging_for_table() grows so becomes a function
storage/maria/maria_read_log.c: maria_apply_log can now return a count of warnings, to temper the "SUCCESS" message printed in the end by maria_read_log. Advise users to make a backup first.
2007-11-14 12:51:16 +01:00
}
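The commit note above says that maria_apply_log() returns a count of warnings and that this count tempers the final "SUCCESS" message. A minimal sketch of that idea follows. It is not the code from ma_recovery.c: every `sketch_` name, the record enum, the one-warning-per-record counting and the message strings are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record kinds; only the concepts echo the commit note. */
enum sketch_rec_type { SKETCH_REDO, SKETCH_UNDO, SKETCH_INCOMPLETE_LOG };

/*
  Walk a list of record types and return a count of warnings.
  Assumption: each "incomplete log" marker yields one warning, and
  skipping the UNDO phase while UNDOs exist yields one more.
*/
unsigned sketch_apply_log(const enum sketch_rec_type *recs, size_t n,
                          int apply_undos)
{
  unsigned warnings= 0;
  int undo_seen= 0;
  size_t i;

  assert(recs != NULL || n == 0);
  for (i= 0; i < n; i++)
  {
    if (recs[i] == SKETCH_INCOMPLETE_LOG)
      warnings++;                    /* log misses REDO_INSERT* records */
    else if (recs[i] == SKETCH_UNDO)
      undo_seen= 1;
  }
  if (undo_seen && !apply_undos)
    warnings++;                      /* UNDOs skipped => inconsistent table */
  return warnings;
}

/* The caller tempers its final message with the count (strings invented). */
const char *sketch_final_message(unsigned warnings)
{
  return warnings ? "SUCCESS (with warnings)" : "SUCCESS";
}
```

Returning a count rather than a flag lets the caller decide how loudly to report, which matches the note's "to temper the SUCCESS message" wording.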
WL#3072 - Maria Recovery
* to honour WAL we now force the whole log when flushing a bitmap page.
* ability to intentionally crash in various places for recovery testing
* bugfix (dirty pages list found in checkpoint record was ignored)
* smaller checkpoint record
* misc small cleanups and comments
mysql-test/include/maria_empty_logs.inc: maria-purge.test creates ~11 logs, remove them all
mysql-test/r/maria-recovery-bitmap.result: result is good; without the _ma_bitmap_get_log_address() call, we got check error Bitmap at 0 has pages reserved outside of data file length
mysql-test/r/maria-recovery.result: result update
mysql-test/t/maria-recovery-bitmap.test: enable test of "bitmap-flush should flush whole log otherwise corrupted data file (bitmap ahead of data pages)".
mysql-test/t/maria-recovery.test: test of checkpoint
sql/sql_table.cc: comment
storage/maria/ha_maria.cc:
  _ma_reenable_logging_for_table() now includes file->trn=0.
  At the end of repair() we don't need to re-enable logging, it is done already by caller (like copy_data_between_tables()); it sounds strange that this function could decide to re-enable, it should be up to caller who knows what other operations it plans. Removing this line led to assertion failure in maria_lock_database(F_UNLCK), fixed by removing the assertion: maria_lock_database() is here called in a context where F_UNLCK does not make the table visible to others so assertion is excessive, and external_lock() is already designed to honour the asserted condition.
  Ability to crash at the end of bulk insert when indices have been enabled.
storage/maria/ma_bitmap.c:
  Better use pagecache_file_init() than set pagecache callbacks directly; and a new function to set those callbacks for bitmap so that we can reuse it.
  _ma_bitmap_get_log_address() is a pagecache get_log_address callback which causes the whole log to be flushed when a bitmap page is flushed by the page cache. This was required by WAL.
storage/maria/ma_blockrec.c: get_log_address pagecache callback for data (non bitmap) pages: just reads the LSN from the page's content, like was hard-coded before in ma_pagecache.c.
storage/maria/ma_blockrec.h: functions which need to be exported
storage/maria/ma_check.c: create_new_data_handle() can be static. Ability to crash after rebuilding the index in OPTIMIZE, in REPAIR. my_lock() implemented already.
storage/maria/ma_checkpoint.c: As MARIA_SHARE* is now accessible to pagecache_collect_changed_blocks_LSN(), we don't need to store kfile/dfile descriptors in checkpoint record, 2-byte-id of the table plus one byte to say if this is data or index file is enough. So we go from 4+4 bytes per table down to 2+1.
storage/maria/ma_commit.c: removing duplicate functions (see _ma_tmp_disable_logging_for_table())
storage/maria/ma_extra.c: Monty fixed
storage/maria/ma_key_recover.c: comment
storage/maria/ma_locking.c: Sometimes other code does funny things with maria_lock_database(), like ha_maria::repair() calling it at start and end without going through ha_maria::external_lock(). So it happens that maria_lock_database() is called with now_transactional!=born_transactional.
storage/maria/ma_loghandler.c: update to new prototype
storage/maria/ma_open.c: set_data|index_pagecache_callbacks() need to be exported as they are now called when disabling/enabling transactionality.
storage/maria/ma_pagecache.c:
  Removing PAGE_LSN_OFFSET, as much of the code relies on it being 0 anyway (let's not give impression we can just change this constant).
  When flushing a page to disk, call the get_log_address callback to know up to which LSN the log should be flushed.
  As we now can access MARIA_SHARE* we can know share->id and store it into the checkpoint record; we thus go from 4 bytes per dirty page to 2+1.
storage/maria/ma_pagecache.h: get_log_address callback
storage/maria/ma_panic.c: No reason to reset pagecache callbacks in HA_PANIC_READ: all we do is reopen files if they were closed; callbacks should be in place already as 'info' exists; we just want to modify the file descriptors, not the full PAGECACHE_FILE structure. If we open data file and it was closed, share->bitmap.file needs to be set. Note that the modified code is disabled anyway.
storage/maria/ma_recovery.c:
  Checkpoint record does not contain kfile/dfile descriptors anymore so code can be simplified. Hash key in all_dirty_pages is not made from file_descriptor & pageno anymore, but index_or_data & table-short-id & pageno.
  If a table's create_rename_lsn is higher than record's LSN, we skip the table and don't fail if it's corrupted (because the LSNs say that we don't have to look at this table).
  If a table is skipped (for example due to create_rename_lsn), its UNDOs still cause undo_lsn to advance; this is so that if later we notice the transaction has to rollback we fail (as table should not be skipped in this case).
  Fixing a bug: the dirty_pages list was never used, because the LSN below which it was used was the minimum rec_lsn of dirty pages! It is now the min(checkpoint_start_log_horizon, min(trn's rec_lsn)).
  When we disable/reenable transactionality, we modify pagecache callbacks (needed for example for get_log_address: changing share->page_type is not enough anymore).
storage/maria/ma_write.c: 'records' and 'checksum' are protected: they are updated under log's mutex in write-hooks when UNDO is written.
storage/maria/maria_chk.c: remove use of duplicate functions.
storage/maria/maria_def.h: set_data|index_pagecache_callbacks() need to be exported; _ma_reenable_logging_for_table() changes to a real function.
storage/maria/unittest/ma_pagecache_consist.c: new prototype
storage/maria/unittest/ma_pagecache_single.c: new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype
2007-12-30 21:32:07 +01:00
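Two points in the commit note above lend themselves to a small illustration: the all_dirty_pages hash key built from index_or_data, the table's short id and the page number, and the corrected LSN threshold min(checkpoint_start_log_horizon, min(trn's rec_lsn)). The sketch below is only an assumption-laden illustration. The bit layout (flag at bit 56, 16-bit short id, 40-bit page number), the `sketch_` names and the plain 64-bit LSN type are hypothetical and not taken from ma_recovery.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_lsn;   /* stand-in for an LSN, assumed orderable */

/*
  Pack (index_or_data, table short id, page number) into one 64-bit key.
  Assumed layout: bit 56 = index flag, bits 40..55 = 2-byte short id,
  bits 0..39 = page number.
*/
uint64_t sketch_dirty_page_key(int is_index, uint16_t short_id,
                               uint64_t pageno)
{
  assert(pageno < (UINT64_C(1) << 40));
  return ((uint64_t) (is_index != 0) << 56) |
         ((uint64_t) short_id << 40) |
         pageno;
}

/*
  LSN threshold as described in the note: the minimum of the checkpoint's
  start log horizon and the rec_lsn of every transaction.
*/
sketch_lsn sketch_dirty_pages_threshold(sketch_lsn checkpoint_start_log_horizon,
                                        const sketch_lsn *trn_rec_lsn,
                                        size_t n)
{
  sketch_lsn min_lsn= checkpoint_start_log_horizon;
  size_t i;

  for (i= 0; i < n; i++)
  {
    if (trn_rec_lsn[i] < min_lsn)
      min_lsn= trn_rec_lsn[i];
  }
  return min_lsn;
}
```

Keying on the short id instead of a file descriptor means the key stays meaningful across a restart, when descriptors from the crashed process no longer exist.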
static void print_redo_phase_progress(TRANSLOG_ADDRESS addr)
{
Added --with-maria-tmp-tables (default one) to allow one to configure if Maria should be used for internal temporary tables
Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables
Fixed bug that caused update of big blobs to crash
Use pagecache_page_no_t as type for pages (to get rid of compiler warnings)
Added cast to get rid of compiler warning
Fixed wrong types of variables and arguments that caused lost information
Fixed wrong DBUG_ASSERT() that caused REDO of big blobs to fail
Removed some historical ifdefs that caused problems with Windows compilations
BUILD/SETUP.sh: Added --with-maria-tmp-tables
include/maria.h:
  Use pagecache_page_no_t as type for pages
  Use my_bool as parameter for 'rep_quick' option
include/my_base.h: Added comment
mysql-test/r/maria-big.result: Added test that uses big blobs
mysql-test/t/maria-big.test: Added test that uses big blobs
sql/mysqld.cc: Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables
sql/sql_class.h: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined
sql/sql_select.cc: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined
storage/maria/ha_maria.cc:
  Fixed compiler warnings reported by MCC
  - Fixed usage of wrong types that caused data loss
  - Changed parameter for rep_quick to my_bool
  - Added safe casts
  Fixed indentation
storage/maria/ma_bitmap.c:
  Use pagecache_page_no_t as type for pages
  Fixed compiler warnings
  Fixed bug that caused update of big blobs to crash
storage/maria/ma_blockrec.c:
  Use pagecache_page_no_t as type for pages
  Use my_bool as parameter for 'rep_quick' option
  Fixed compiler warnings
  Fixed wrong DBUG_ASSERT()
storage/maria/ma_blockrec.h: Use pagecache_page_no_t as type for pages
storage/maria/ma_check.c:
  Fixed some wrong parameters where we didn't get all bits for test_flag
  Changed rep_quick to be of type my_bool
  Use pagecache_page_no_t as type for pages
  Added casts to get rid of compiler warnings
  Changed type of record_pos to get rid of compiler warning
storage/maria/ma_create.c: Added safe casts to get rid of compiler warnings
storage/maria/ma_dynrec.c: Fixed usage of wrong type
storage/maria/ma_key.c: Fixed compiler warning
storage/maria/ma_key_recover.c: Use pagecache_page_no_t as type for pages
storage/maria/ma_loghandler_lsn.h: Added casts to get rid of compiler warnings
storage/maria/ma_page.c:
  Changed variable name from 'page' to 'pos' as it was an offset and not a page address
  Moved page_size inside block to get rid of compiler warning
storage/maria/ma_pagecache.c:
  Fixed compiler warnings
  Replaced compile time assert with TODO
storage/maria/ma_pagecache.h: Use pagecache_page_no_t as type for pages
storage/maria/ma_pagecrc.c: Allow bitmap pages that are all zero
storage/maria/ma_preload.c: Added cast to get rid of compiler warning
storage/maria/ma_recovery.c:
  Changed types to get rid of compiler warnings
  Use bool for quick_repair to get rid of compiler warning
  Fixed some variables that were wrongly declared (not enough precision)
  Added cast to get rid of compiler warning
storage/maria/ma_test2.c: Remove historical undefs
storage/maria/maria_chk.c:
  Changed rep_quick to bool
  Fixed wrong parameter to maria_chk_data_link()
storage/maria/maria_def.h: Use pagecache_page_no_t as type for pages
storage/maria/maria_pack.c: Renamed isam -> maria
storage/maria/plug.in: Added option --with-maria-tmp-tables
storage/maria/trnman.c: Added cast to get rid of compiler warning
storage/myisam/mi_test2.c: Remove historical undefs
2008-01-10 20:21:36 +01:00
static uint end_logno= FILENO_IMPOSSIBLE, percentage_printed= 0;
static ulong end_offset;
2008-02-21 01:45:02 +01:00
static ulonglong initial_remainder= ~(ulonglong) 0;
2008-01-10 20:21:36 +01:00
uint cur_logno;
ulong cur_offset;
First part of redo/undo for key pages
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows
Checksum for MyISAM now ignores NULL and not used part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable
include/maria.h:
  Made min page cache larger (needed for pinning key page)
  Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
  Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h: Added new error message to be used when handler initialization failed
include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h: Added const to some parameters
mysys/array.c: More DBUG
mysys/my_error.c: Fixed indentation
mysys/my_handler.c:
  Added const to some parameters
  Added missing error messages
sql/field.h: Renamed variables to avoid variable shadowing
sql/handler.h: Renamed parameter to avoid variable name conflict
sql/item.h: Renamed variables to avoid variable shadowing
sql/log_event_old.h: Renamed variables to avoid variable shadowing
sql/set_var.h: Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
  Removed maria hack for temporary tables
  Fixed indentation
sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
  Copy page_checksum from share
  Removed Maria hack
storage/maria/Makefile.am: Added new files
storage/maria/ha_maria.cc:
  Renamed records -> record_count and info -> create_info to avoid variable name conflicts
  Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
  Moved _ma_unpin_all_pages() to ma_key_recover.c
  Moved init of info->pinned_pages to ma_open.c
  Moved _ma_finalize_row() to maria_key_recover.h
  Renamed some variables to avoid variable name conflicts
  Mark page_link.changed for blocks we change directly
  Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h: Removed extra empty line
storage/maria/ma_checkpoint.c: Remove not needed trnman.h
storage/maria/ma_close.c: Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c: Give error message if we fail to open control file
storage/maria/ma_delete.c:
  Changes for redo logging (first part, logging of underflow not yet done)
  - Log undo-key-delete
  - Log delete of key
  - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
  - Added new arguments to some functions to be able to write redo information
  - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
  Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway
  Changed 2 bmove_upp() to bmove() as this made code easier to understand
  More function comments
  Indentation fixes
storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
  Fixed some DBUG_PRINT messages
  Simplify code
  Added new log entries for key page redo
  Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
  Moved some defines here
  Added define for storing key number on key pages
  Added new translog record types
  Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
  Always allocate info.pinned_pages (we need now also for normal key page usage)
  Update keyinfo->key_nr
  Added virtual functions to convert record position to number to be stored on key pages
  Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
  Added redo for key pages
  - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE
  - _ma_fetch_keypage() now pins pages if needed
  - Extended _ma_write_keypage() with type of locks to be used
  - ma_dispose() now locks info->s->state.key_del from other threads
  - ma_dispose() writes redo log record
  - ma_new() locks info->s->state.key_del from other threads if it was used
  - ma_new() now pins read page
  Other things:
  - Removed some not needed arguments from _ma_new() and _ma_dispose()
  - Added some new variables to simplify code
  - If EXTRA_DEBUG is used, do crc on full page to catch uninitialized bytes
storage/maria/ma_pagecache.h:
  Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
  Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
  - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT.
  - Moved variable declarations to start of function (portability fixes)
  - Removed some not needed initializations
  - Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c: Removed end space
storage/maria/ma_check.c: Removed end space
storage/maria/ma_create.c: Removed end space
storage/maria/ma_locking.c: Removed end space
storage/maria/ma_packrec.c: Removed end space
storage/maria/ma_pagecache.c: Removed end space
storage/maria/ma_panic.c: Removed end space
storage/maria/ma_rt_index.c:
  Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
  Fixed indentation
storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
  Added new arguments for call to _ma_new()
  Use new keypage header
  Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
  Updated comments & indentation
  Added new arguments for call to _ma_fetch_keypage()
  Made some variables and arguments const
  Added virtual functions for converting row position to number to be stored in key
  use MARIA_RECORD_POS of record position instead of my_off_t
  Record in MARIA_KEY_PARAM how page was changed on key insert (needed for REDO)
storage/maria/ma_sort.c: Removed end space
storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
  Fixed too small buffer to init_pagecache()
  Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
  Use more reasonable pagecache size
  Remove not used code
  Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh: Fixed wrong test
storage/maria/ma_write.c:
  Lots of new code to handle REDO of key pages
  No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
  Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
  Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open()
  Zerofill new used pages for:
  - To remove possible sensitive data left in buffer
  - To get identical data on pages after running redo
  - Better compression of pages if archived
storage/maria/maria_chk.c: Added information if table is crash safe
storage/maria/maria_def.h:
  New virtual function to convert between record position on key and normal record position
  Added mutex and extra variables to handle locking of share->state.key_del
  Moved some structure variables to get things more aligned
  Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
  Added argument to MARIA_PINNED_PAGE to indicate if page was changed
  Updated prototypes for functions
  Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls
storage/myisam/mi_check.c: Made calc_check_checksum virtual
storage/myisam/mi_checksum.c: Update checksums to ignore null columns
storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c: Fixed bug
storage/myisam/myisamdef.h:
  Added virtual function for calculating checksum during check table
  Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
ulonglong local_remainder;
2008-01-10 20:21:36 +01:00
uint percentage_done;
First part of redo/undo for key pages Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows Checksum for MyISAM now ignores NULL and not used part of VARCHAR Renamed some variables that caused shadow compiler warnings Moved extra() call when waiting for tables to not be used to after tables are removed from cache. Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug. pagecache_unlock_by_ulink() now has extra argument to say if page was changed. Give error message if we fail to open control file Mark page cache variables as not flushable include/maria.h: Made min page cache larger (needed for pinning key page) Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion Added write_comp_flag to move some runtime code to maria_open() include/my_base.h: Added new error message to be used when handler initialization failed include/my_global.h: Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables include/my_handler.h: Added const to some parameters mysys/array.c: More DBUG mysys/my_error.c: Fixed indentation mysys/my_handler.c: Added const to some parameters Added missing error messages sql/field.h: Renamed variables to avoid variable shadowing sql/handler.h: Renamed parameter to avoid variable name conflict sql/item.h: Renamed variables to avoid variable shadowing sql/log_event_old.h: Renamed variables to avoid variable shadowing sql/set_var.h: Renamed variables to avoid variable shadowing sql/sql_delete.cc: Removed maria hack for temporary tables Fixed indentation sql/sql_table.cc: Moved extra() call when waiting for tables to not be used to after tables are removed from cache. This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use. 
sql/table.cc: Copy page_checksum from share Removed Maria hack storage/maria/Makefile.am: Added new files storage/maria/ha_maria.cc: Renamed records -> record_count and info -> create_info to avoid variable name conflicts Mark page cache variables as not flushable storage/maria/ma_blockrec.c: Moved _ma_unpin_all_pages() to ma_key_recover.c Moved init of info->pinned_pages to ma_open.c Moved _ma_finalize_row() to maria_key_recover.h Renamed some variables to avoid variable name conflicts Mark page_link.changed for blocks we change directly Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index) storage/maria/ma_blockrec.h: Removed extra empty line storage/maria/ma_checkpoint.c: Remove not needed trnman.h storage/maria/ma_close.c: Free pinned pages (which are now always allocated) storage/maria/ma_control_file.c: Give error message if we fail to open control file storage/maria/ma_delete.c: Changes for redo logging (first part, logging of underflow not yet done) - Log undo-key-delete - Log delete of key - Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert() - Added new arguments to some functions to be able to write redo information - Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway Changed 2 bmove_upp() to bmove() as this made code easer to understand More function comments Indentation fixes storage/maria/ma_ft_update.c: New arguments to _ma_write_keypage() storage/maria/ma_loghandler.c: Fixed some DBUG_PRINT messages Simplify code Added new log entrys for key page redo Renamed some variables to avoid variable name shadowing storage/maria/ma_loghandler.h: Moved some defines here Added define for storing key number on key pages Added new translog record types Added enum for type of operations in LOGREC_REDO_INDEX storage/maria/ma_open.c: Always 
allocate info.pinned_pages (we need now also for normal key page usage) Update keyinfo->key_nr Added virtual functions to convert record position o number to be stored on key pages Update keyinfo->write_comp_flag to value of search flag to be used when writing key storage/maria/ma_page.c: Added redo for key pages - Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE - _ma_fetch_keypage() now pin's pages if needed - Extended _ma_write_keypage() with type of locks to be used - ma_dispose() now locks info->s->state.key_del from other threads - ma_dispose() writes redo log record - ma_new() locks info->s->state.key_del from other threads if it was used - ma_new() now pins read page Other things: - Removed some not needed arguments from _ma_new() and _ma_dispose) - Added some new variables to simplify code - If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes storage/maria/ma_pagecache.h: Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed Added some defines for pagecache priority levels that one can use storage/maria/ma_range.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_recovery.c: - Added hooks for new translog types: REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and UNDO_KEY_DELETE_WITH_ROOT. 
- Moved variable declarations to start of function (portability fixes) - Removed some not needed initializations - Set only relevant state changes for each redo/undo entry storage/maria/lockman.c: Removed end space storage/maria/ma_check.c: Removed end space storage/maria/ma_create.c: Removed end space storage/maria/ma_locking.c: Removed end space storage/maria/ma_packrec.c: Removed end space storage/maria/ma_pagecache.c: Removed end space storage/maria/ma_panic.c: Removed end space storage/maria/ma_rt_index.c: Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new() Fixed indentation storage/maria/ma_rt_key.c: Added new arguments for call to _ma_fetch_keypage() storage/maria/ma_rt_split.c: Added new arguments for call to _ma_new() Use new keypage header Added new arguments for call to _ma_write_keypage() storage/maria/ma_search.c: Updated comments & indentation Added new arguments for call to _ma_fetch_keypage() Made some variables and arguments const Added virtual functions for converting row position to number to be stored in key use MARIA_RECORD_POS of record position instead of my_off_t Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO) storage/maria/ma_sort.c: Removed end space storage/maria/ma_statrec.c: Updated arguments for call to _ma_rec_pos() storage/maria/ma_test1.c: Fixed too small buffer to init_pagecache() Fixed bug when using insert_count and test_flag storage/maria/ma_test2.c: Use more resonable pagecache size Remove not used code Reset blob_length to fix wrong output message storage/maria/ma_test_all.sh: Fixed wrong test storage/maria/ma_write.c: Lots of new code to handle REDO of key pages No logic changes because of REDO code, mostly adding new arguments and adding new code for logging Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open() Zerofill new 
used pages for: - To remove possible sensitive data left in buffer - To get identical data on pages after running redo - Better compression of pages if archived storage/maria/maria_chk.c: Added information if table is crash safe storage/maria/maria_def.h: New virtual function to convert between record position on key and normal record position Added mutex and extra variables to handle locking of share->state.key_del Moved some structure variables to get things more aligned Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert Added argument to MARIA_PINNED_PAGE to indicate if page was changed Updated prototypes for functions Added some structures for signaling changes in REDO handling storage/maria/unittest/ma_pagecache_single.c: Updated arguments for changed function calls storage/myisam/mi_check.c: Made calc_check_checksum virtual storage/myisam/mi_checksum.c: Update checksums to ignore null columns storage/myisam/mi_create.c: Mark if table has null column (to know when we have to use mi_checksum()) storage/myisam/mi_open.c: Added virtual function for calculating checksum to be able to easily ignore NULL fields storage/myisam/mi_test2.c: Fixed bug storage/myisam/myisamdef.h: Added virtual function for calculating checksum during check table Removed ha_key_cmp() as this is in handler.h storage/maria/ma_key_recover.c: New BitKeeper file ``storage/maria/ma_key_recover.c'' storage/maria/ma_key_recover.h: New BitKeeper file ``storage/maria/ma_key_recover.h'' storage/maria/ma_key_redo.c: New BitKeeper file ``storage/maria/ma_key_redo.c''
2007-11-14 18:08:06 +01:00
if (tracef == stdout)
return;
if (recovery_message_printed == REC_MSG_NONE)
{
WL#3071 Maria checkpoint, WL#3072 Maria recovery instead of fprintf(stderr) when a task (with no user connected) gets an error, use my_printf_error(). Flags ME_JUST_WARNING and ME_JUST_INFO added to my_error()/my_printf_error(), which pass it to my_message_sql() which is modified to call the appropriate sql_print_*(). This way recovery can signal its start and end with [Note] and not [ERROR] (but failure with [ERROR]). Recovery's detailed progress (percents etc) still uses stderr as they have to stay on one single line. sql_print_error() changed to use my_progname_short (nicer display). mysql-test-run.pl --gdb/--ddd does not run mysqld, because a breakpoint in mysql_parse is too late to debug startup problems; instead, dev should set the breakpoints it wants and then "run" ("r"). include/my_sys.h: new flags to tell error_handler_hook that this is not an error but an information or warning mysql-test/mysql-test-run.pl: when running with --gdb/--ddd to debug mysqld, breaking at mysql_parse is too late to debug startup problems; now, it does not run mysqld, does not set breakpoints, developer can set as early breakpoints as it wants and is responsible for typing "run" (or "r") mysys/my_init.c: set my_progname_short mysys/my_static.c: my_progname_short added sql/mysqld.cc: * my_message_sql() can now receive info or warning, not only error; this allows mysys to tell the user (or the error log if no user) about an info or warning. Used from Maria. 
* plugins (or engines like Maria) may want to call my_error(), so set up the error handler hook (my_message_sql) before initializing plugins; otherwise they get my_message_no_curses which is less integrated into mysqld (is just fputs()) * using my_progname_short instead of my_progname, in my_message_sql() (less space on screen) storage/maria/ma_checkpoint.c: fprintf(stderr) -> ma_message_no_user() storage/maria/ma_checkpoint.h: function for any Maria task, not connected to a user (example: checkpoint, recovery; soon could be deleted records purger) to report a message (calls my_printf_error() which, when inside ha_maria, leads to sql_print_*(), and when outside, leads to my_message_no_curses i.e. stderr). storage/maria/ma_recovery.c: To tell that recovery starts and ends we use ma_message_no_user() (sql_print_*() in practice). Detailed progress info still uses stderr as sql_print() cannot put several messages on one line. 071116 18:42:16 [Note] mysqld: Maria engine: starting recovery recovered pages: 0% 67% 100% (0.0 seconds); transactions to roll back: 1 0 (0.0 seconds); tables to flush: 1 0 (0.0 seconds); 071116 18:42:16 [Note] mysqld: Maria engine: recovery done storage/maria/maria_chk.c: my_progname_short moved to mysys storage/maria/maria_read_log.c: my_progname_short moved to mysys storage/myisam/myisamchk.c: my_progname_short moved to mysys
2007-11-16 17:09:51 +01:00
print_preamble();
fprintf(stderr, "recovered pages: 0%%");
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
fflush(stderr);
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_parallel) REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with fewer than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contain page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updated some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
procent_printed= 1;
recovery_message_printed= REC_MSG_REDO;
}
if (end_logno == FILENO_IMPOSSIBLE)
{
LSN end_addr= translog_get_horizon();
end_logno= LSN_FILE_NO(end_addr);
end_offset= LSN_OFFSET(end_addr);
}
  cur_logno= LSN_FILE_NO(addr);
  cur_offset= LSN_OFFSET(addr);
  local_remainder= (cur_logno == end_logno) ? (end_offset - cur_offset) :
    (((longlong)log_file_size) - cur_offset +
     max(end_logno - cur_logno - 1, 0) * ((longlong)log_file_size) +
     end_offset);
  if (initial_remainder == (ulonglong)(-1))
    initial_remainder= local_remainder;
Added --with-maria-tmp-tables (default one) to allow on to configure if Maria should be used for internal temporary tables Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables Fixed bug that caused update of big blobs to crash Use pagecache_page_no_t as type for pages (to get rid of compiler warnings) Added cast to get rid of compiler warning Fixed wrong types of variables and arguments that caused lost information Fixed wrong DBUG_ASSERT() that caused REDO of big blobs to fail Removed some historical ifdefs that caused problem with windows compilations BUILD/SETUP.sh: Added --with-maria-tmp-tables include/maria.h: Use pagecache_page_no_t as type for pages Use my_bool as parameter for 'rep_quick' option include/my_base.h: Added comment mysql-test/r/maria-big.result: Added test that uses big blobs mysql-test/t/maria-big.test: Added test that uses big blobs sql/mysqld.cc: Abort mysqld if Maria engine didn't start and we are using Maria for temporary tables sql/sql_class.h: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined sql/sql_select.cc: Don't use Maria for temporary tables if --with-maria-tmp-tables is not defined storage/maria/ha_maria.cc: Fixed compiler warnings reported by MCC - Fixed usage of wrong types that caused data loss - Changed parameter for rep_quick to my_bool - Added safe casts Fixed indentation storage/maria/ma_bitmap.c: Use pagecache_page_no_t as type for pages Fixed compiler warnings Fixed bug that caused update of big blobs to crash storage/maria/ma_blockrec.c: Use pagecache_page_no_t as type for pages Use my_bool as parameter for 'rep_quick' option Fixed compiler warnings Fixed wrong DBUG_ASSERT() storage/maria/ma_blockrec.h: Use pagecache_page_no_t as type for pages storage/maria/ma_check.c: Fixed some wrong parameters where we didn't get all bits for test_flag Changed rep_quick to be of type my_bool Use pagecache_page_no_t as type for pages Added cast's to get rid of compiler 
warnings Changed type of record_pos to get rid of compiler warning storage/maria/ma_create.c: Added safe cast's to get rid of compiler warnings storage/maria/ma_dynrec.c: Fixed usage of wrong type storage/maria/ma_key.c: Fixed compiler warning storage/maria/ma_key_recover.c: Use pagecache_page_no_t as type for pages storage/maria/ma_loghandler_lsn.h: Added cast's to get rid of compiler warnings storage/maria/ma_page.c: Changed variable name from 'page' to 'pos' as it was an offset and not a page address Moved page_size inside block to get rid of compiler warning storage/maria/ma_pagecache.c: Fixed compiler warnings Replaced compile time assert with TODO storage/maria/ma_pagecache.h: Use pagecache_page_no_t as type for pages storage/maria/ma_pagecrc.c: Allow bitmap pages that is all zero storage/maria/ma_preload.c: Added cast to get rid of compiler warning storage/maria/ma_recovery.c: Changed types to get rid of compiler warnings Use bool for quick_repair to get rid of compiler warning Fixed some variables that was wrongly declared (not enough precission) Added cast to get rid of compiler warning storage/maria/ma_test2.c: Remove historical undefs storage/maria/maria_chk.c: Changed rep_quick to bool Fixed wrong parameter to maria_chk_data_link() storage/maria/maria_def.h: Use pagecache_page_no_t as type for pages storage/maria/maria_pack.c: Renamed isam -> maria storage/maria/plug.in: Added option --with-maria-tmp-tables storage/maria/trnman.c: Added cast to get rid of compiler warning storage/myisam/mi_test2.c: Remove historical undefs
2008-01-10 20:21:36 +01:00
  percentage_done= (uint) ((initial_remainder - local_remainder) * ULL(100) /
                           initial_remainder);
  if ((percentage_done - percentage_printed) >= 10)
  {
    percentage_printed= percentage_done;
    fprintf(stderr, " %u%%", percentage_done);
WL#3072 - Maria Recovery Bulk insert: don't log REDO/UNDO for rows, log one UNDO which will truncate files; this is an optimization and a bugfix (table was left half-repaired by crash). Repair: mark table crashed-on-repair at start, bump skip_redo_lsn at start, this is easier for recovery (tells it to skip old REDOs or even UNDO phase) and user (tells it to repair) in case of crash, sync files in the end. Recovery skips missing or corrupted table and moves to next record (in REDO or UNDO phase) to be more robust; warns if happens in UNDO phase. Bugfix for UNDO_KEY_DELETE_WITH_ROOT (tested in ma_test_recovery) and maria_enable_indexes(). Create missing bitmaps when needed (there can be more than one to create, in rare cases), log a record for this. include/myisamchk.h: new flag: bulk insert repair mustn't bump create_rename_lsn mysql-test/lib/mtr_report.pl: skip normal warning in maria-recovery.test mysql-test/r/maria-recovery.result: result: crash before bulk insert is committed, causes proper rollback, and crash right after OPTIMIZE replaces index file with new index file leads to table marked corrupted and recovery not failing. mysql-test/t/maria-recovery.test: - can't check the table or it would commit the transaction, but check is made after recovery. - test of crash before bulk-insert-with-repair is committed (to see if it is rolled back), and of crash after OPTIMIZE has replaced index file but not finished all operations (to see if recovery fails - it used to assert when trying to execute an old REDO on the new index). storage/maria/CMakeLists.txt: new file storage/maria/Makefile.am: new file storage/maria/ha_maria.cc: - If bulk insert on a transactional table using an index repair: table is initially empty, so don't log REDO/UNDO for data rows (optimization), just log an UNDO_BULK_INSERT_WITH_REPAIR which will, if executed, empty the data and index file. Re-enable logging in end_bulk_insert(). 
- write log record for repair operation only after it's fully done, index sort including (maria_repair*() used to write the log record). - Adding back file->trn=NULL which was removed by mistake earlier. storage/maria/ha_maria.h: new member (see ha_maria.cc) storage/maria/ma_bitmap.c: Functions to create missing bitmaps: - one function which creates missing bitmaps in page cache, except the missing one with max offset which it does not put into page cache as it will be modified very soon. - one function which the one above calls, and creates bitmaps in page cache - one function to execute REDO_BITMAP_NEW_PAGE which uses the second one above. storage/maria/ma_blockrec.c: - when logging REDO_DELETE_ALL, not only 'records' and 'checksum' has to be reset under log's mutex. - execution of REDO_INSERT_ROW_BLOBS now checks the dirty pages' list - execution of UNDO_BULK_INSERT_WITH_REPAIR storage/maria/ma_blockrec.h: new functions storage/maria/ma_check.c: - table-flush-before-repair is moved to a separate function reused by maria_sort_index(); syncing is added - maria_repair() is allowed to re-enable logging only if it is the one which disabled it. - "_ma_flush_table_files_after_repair" was a bad name, it's not after repair now, and it should not sync as we do more changes to the files shortly after (sync is postponed to when writing the log record) - REDO_REPAIR record should be written only after all repair operations (in particular after sorting index in ha_mara::repair()) - close to the end of repair by sort, flushing of pages must happen also in the non-quick case, to prepare for the sync at end. - in parallel repair, some page flushes are not needed as done by initialize_variables_for_repair(). storage/maria/ma_create.c: Update skip_redo_lsn, create_rename_lsn optionally. storage/maria/ma_delete_all.c: Need to sync files at end of maria_delete_all_rows(), if transactional. 
storage/maria/ma_extra.c: During repair, we sometimes call _ma_flush_table_files() (via _ma_flush_table_files_before_swap()) while there is a WRITE_CACHE. storage/maria/ma_key_recover.c: - when we see CLR_END for UNDO_BULK_INSERT_WITH_REPAIR, re-enable indices. - fixing bug: _ma_apply_undo_key_delete() parsed UNDO_KEY_DELETE_WITH_ROOT wrongly, leading to recovery failure storage/maria/ma_key_recover.h: new prototype storage/maria/ma_locking.c: DBUG_VOID_RETURN missing storage/maria/ma_loghandler.c: UNDO for bulk insert with repair, and REDO for creating bitmaps. LOGREC_FIRST_FREE to not have to change the for() every time we add a new record type. storage/maria/ma_loghandler.h: new UNDO and REDO storage/maria/ma_open.c: Move share.kfile.file=kfile up a bit, so that _ma_update_state_lsns() can get its value, this fixes a bug where LSN_REPAIRED_BY_MARIA_CHK was not corrected on disk by maria_open(). Store skip_redo_lsn in index' header. maria_enable_indexes() had a bug for BLOCK_RECORD, where an empty file has one page, not 0 bytes. storage/maria/ma_recovery.c: - Skip a corrupted, missing, or repaired-with-maria_chk, table in recovery: don't fail, just go to next REDO or UNDO; but if an UNDO is skipped in UNDO phase we issue warnings. - Skip REDO|UNDO in REDO phase if <skip_redo_lsn. - If UNDO phase fails, delete transactions to not make trnman assert. 
- Update skip_redo_lsn when playing REDO_CREATE_TABLE - Don't record UNDOs for old transactions which we don't know (long_trid==0) - Bugfix for UNDO_KEY_DELETE_WITH_ROOT (see ma_key_recover.c) - Execution of UNDO_BULK_INSERT_WITH_REPAIR - Don't try to find a page number in REDO_DELETE_ALL - Pieces moved to ma_recovery_util.c storage/maria/ma_rename.c: name change storage/maria/ma_static.c: I modified layout of the index' header (inserted skip_redo_lsn in its middle) storage/maria/ma_test2.c: allow breaking the test towards the end, tests execution of UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_test_recovery.expected: 6 as testflag instead of 4 storage/maria/ma_test_recovery: Increase the amount of rollback work to do when testing recovery with ma_test2; this reproduces the UNDO_KEY_DELETE_WITH_ROOT bug. storage/maria/maria_chk.c: skip_redo_lsn should be updated too, for consistency. Write a REDO_REPAIR after all operations (including sort-records) have been done. No reason to flush blocks after maria_chk_data_link() and maria_sort_records(), there is maria_close() in the end. write_log_record() is a function, to not clutter maria_chk(). storage/maria/maria_def.h: New member skip_redo_lsn in the state, and comments storage/maria/maria_pack.c: skip_redo_lsn should be updated too, for consistency storage/maria/ma_recovery_util.c: _ma_redo_not_needed_for_page(), defined in ma_recovery.c, is needed by ma_blockrec.c; this causes link issues, resolved by putting _ma_redo_not_needed_for_page() into a new file (so that it is not in the same file as repair-related objects of ma_recovery.c). storage/maria/ma_recovery_util.h: new file
2008-01-17 23:59:32 +01:00
    fflush(stderr);
Added error HA_ERR_FILE_TOO_SHORT to be used when files are shorter than expected (by my_read/my_pread) Added debugger hook _my_dbug_put_break_here() that is called if we get a CRC that matches --debug-crc-break (my_crc_dbug_break) Fixed REDO_REPAIR to use all repair modes (repair, repair_by_sort, repair_paralell REDO_REPAIR now also logs used key map Fixed some bugs in REDO logging of key pages Better error messages from maria_read_log Added my_readwrite_flags to init_pagecache() to be able to get better error messages and simplify code. Don't allow pagecaches with less than 8 blocks (Causes strange crashes) Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums (these are calculated and checked in DBUG mode, ignored otherwise) Fixed bug in ma_pagecache unit tests that caused program to sometimes fail Added some missing calls to MY_INIT() that caused some unit tests to fail Fixed that TRUNCATE works properly on temporary MyISAM files Updates some result files to new table checksums results (checksum when NULL fields are ignored) perl test-insert can be replayed with maria_read_log! 
sql/share/Makefile.am: Change mode to -rw-rw-r-- BitKeeper/etc/ignore: added storage/maria/unittest/page_cache_test_file_1 storage/maria/unittest/pagecache_debug.log include/maria.h: Added maria_tmpdir include/my_base.h: Added error HA_ERR_FILE_TOO_SHORT include/my_sys.h: Added variable my_crc_dbug_check Added function my_dbug_put_break_here() include/myisamchk.h: Added org_key_map (Needed for writing REDO record for REPAIR) mysql-test/r/innodb.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/mix2_myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/r/myisam.result: Updated to new checksum algorithm (NULL ignored) mysql-test/t/myisam.test: Added used table mysys/checksum.c: Added DBUG for checksum results Added debugger hook so that _my_dbug_put_break_here() is called if we get matching CRC mysys/lf_alloc-pin.c: Fixed compiler warning mysys/my_handler.c: Added new error message mysys/my_init.c: If my_progname is not given, use 'unknown' form my_progname_short Added debugger function my_debug_put_break_here() mysys/my_pread.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT mysys/my_read.c: In case of too short file when MY_NABP or MY_FNABP is specified, give error HA_ERR_FILE_TO_SHORT sql/mysqld.cc: Added debug option --debug-crc-break sql/sql_parse.cc: Trivial optimization storage/maria/ha_maria.cc: Renamed variable to be more logical Ensure that param.testflag is correct when calling repair Added extra argument to init_pagecache Set default value for maria_tempdir storage/maria/ma_blockrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_check.c: Set param->testflag to match how repair is run (needed for REDO logging) Simple optimization Moved flag if page is node from pagelength to keypage-flag byte Log used key map in REDO log. 
storage/maria/ma_delete.c: Remember previous UNDO entry when writing undo (for future CLR records) Moved flag if page is node from pagelength to keypage-flag byte Fixed some bugs in redo logging Added CRC for some translog REDO_INDEX entries storage/maria/ma_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_ft_update.c: Fixed call to _ma_store_page_used() storage/maria/ma_key_recover.c: Added CRC for some translog REDO_INDEX entries Removed not needed pagecache_write() in _ma_apply_redo_index() storage/maria/ma_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/maria/ma_loghandler.c: Added used key map to REDO_REPAIR_TABLE storage/maria/ma_loghandler.h: Added operation for checksum of key pages storage/maria/ma_open.c: Allocate storage for undo lsn pointers storage/maria/ma_pagecache.c: Remove not needed include file Change logging to use fd: for file descritors as other code Added my_readwrite_flags to init_pagecache() to be able to get better error messages for maria_chk/maria_read_log Don't allow pagecaches with less than 8 blocks Remove wrong DBUG_ASSERT() storage/maria/ma_pagecache.h: Added readwrite_flags storage/maria/ma_recovery.c: Better error messages for maria_read_log: - Added eprint() for printing error messages - Print extra \n before error message if we are printing %0 %10 ... 
Added used key_map to REDO_REPAIR log entry More DBUG Call same repair method that was used by mysqld storage/maria/ma_rt_index.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_rt_key.c: Fixed call to _ma_store_page_used() storage/maria/ma_rt_split.c: Moved flag if page is node from pagelength to keypage-flag byte storage/maria/ma_static.c: Added maria_tmpdir storage/maria/ma_test1.c: Updated call to init_pagecache() storage/maria/ma_test2.c: Updated call to init_pagecache() storage/maria/ma_test3.c: Updated call to init_pagecache() storage/maria/ma_write.c: Removed #ifdef NOT_YET Moved flag if page is node from pagelength to keypage-flag byte Fixed bug in _ma_log_del_prefix() storage/maria/maria_chk.c: Fixed wrong min limit for page_buffer_size Updated call to init_pagecache() storage/maria/maria_def.h: Added EXTRA_DEBUG_KEY_CHANGES. When this is defined some REDO_INDEX entries contains page checksums Moved flag if page is node from pagelength to keypage-flag byte storage/maria/maria_ftdump.c: Updated call to init_pagecache() storage/maria/maria_pack.c: Updated call to init_pagecache() Reset share->state.create_rename_lsn & share->state.is_of_horizon storage/maria/maria_read_log.c: Better error messages Added --tmpdir option (needed to set temporary directory for REDO_REPAIR) Added --start-from-lsn Changed option for --display-only to 'd' (wanted to use -o for 'offset') storage/maria/unittest/lockman2-t.c: Added missing call to MY_INIT() storage/maria/unittest/ma_pagecache_consist.c: Updated call to init_pagecache() storage/maria/unittest/ma_pagecache_single.c: Fixed bug that caused program to sometimes fail Added some DBUG_ASSERTS() Changed some calls to malloc()/free() to my_malloc()/my_free() Create extra file to expose original hard-to-find bug storage/maria/unittest/ma_test_loghandler-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: Updated call to init_pagecache() 
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_multithread-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_noflush-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: Updated call to init_pagecache() storage/maria/unittest/ma_test_loghandler_purge-t.c: Updated call to init_pagecache() storage/maria/unittest/test_file.c: Changed malloc()/free() to my_malloc()/my_free() Fixed memory leak Changd logic a bit while trying to find bug in reset_file() storage/maria/unittest/trnman-t.c: Added missing call to MY_INIT() storage/myisam/mi_cache.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_create.c: Removed O_EXCL to get TRUNCATE to work for temporary files storage/myisam/mi_dynrec.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 storage/myisam/mi_locking.c: Test for HA_ERR_FILE_TOO_SHORT instead for -1 mysql-test/r/old-mode.result: New BitKeeper file ``mysql-test/r/old-mode.result'' mysql-test/t/old-mode-master.opt: New BitKeeper file ``mysql-test/t/old-mode-master.opt'' mysql-test/t/old-mode.test: New BitKeeper file ``mysql-test/t/old-mode.test''
2007-12-04 22:23:42 +01:00
    procent_printed= 1;
  }
}
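The progress computation above can be illustrated in isolation. The following is a minimal sketch, not the code from ma_recovery.c: the names log_pos, remaining_bytes, percent_done, progress and progress_step are hypothetical, and it assumes that every log file has the same size and that the end position is never before the current one.

```c
#include <stdint.h>

/* Hypothetical stand-in for a log address: file number plus byte offset. */
typedef struct { uint32_t file_no; uint64_t offset; } log_pos;

/* Bytes left between cur and end, assuming each log file is file_size bytes. */
static uint64_t remaining_bytes(log_pos cur, log_pos end, uint64_t file_size)
{
  if (cur.file_no == end.file_no)
    return end.offset - cur.offset;
  /* rest of the current file + whole files in between + part of the last */
  return (file_size - cur.offset) +
         (uint64_t) (end.file_no - cur.file_no - 1) * file_size +
         end.offset;
}

/* Share of the initial work already done, as an integer percentage. */
static unsigned percent_done(uint64_t initial, uint64_t now)
{
  if (initial == 0)
    return 100;                       /* nothing to do: avoid dividing by 0 */
  return (unsigned) ((initial - now) * 100 / initial);
}

/* Hypothetical progress state; the first call fixes the initial remainder. */
typedef struct { uint64_t initial; unsigned printed; int started; } progress;

/*
  Returns 1 and stores the percentage in *out when progress has advanced by
  at least 10 points since the last reported value; returns 0 otherwise.
*/
static int progress_step(progress *p, uint64_t remaining, unsigned *out)
{
  unsigned done;
  if (!p->started)
  {
    p->initial= remaining;
    p->printed= 0;
    p->started= 1;
  }
  done= percent_done(p->initial, remaining);
  if (done >= p->printed + 10)
  {
    p->printed= done;
    *out= done;
    return 1;
  }
  return 0;
}
```

A caller would zero-initialize a progress value, pass the result of remaining_bytes() to progress_step() for each record it processes, and print the percentage only when the function returns 1. Comparing with `done >= p->printed + 10` rather than subtracting keeps the unsigned arithmetic from wrapping if the reported remainder ever grows.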
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update it's is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
#ifdef MARIA_EXTERNAL_LOCKING
WL#3072 - Maria recovery. * Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful that 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful that 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. Is it printed n seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key).
2007-10-02 18:02:09 +02:00
#error Marias Checkpoint and Recovery are really not ready for it
#endif
/*
Recovery of the state: how it works
===================================
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
Here we ignore Checkpoints for a start.
The state (MARIA_HA::MARIA_SHARE::MARIA_STATE_INFO) is updated in
memory frequently (at least at every row write/update/delete) but goes
to disk only at a few moments: in maria_close() when closing the last open
instance, and in a few rare places like CHECK/REPAIR/ALTER
(non-transactional tables also do it at maria_lock_database() but we
needn't cover them here).
In case of a crash, the state on disk is likely to be older than what it
was in memory, so the REDO phase needs to recreate the state as it was in
memory at the time of the crash. When we say Recovery here we will always
mean "REDO phase".
Take for example MARIA_STATUS_INFO::records (count of records). It is updated
at the end of every row write/update/delete/delete_all. When Recovery sees the
sign of such a row operation (UNDO or REDO), it may need to update the record
count if that count does not reflect that operation (is older). How to know
the age of the state compared to the log record: every time the state
goes to disk at runtime, its member "is_of_horizon" is updated to the
current end-of-log horizon. So Recovery just needs to compare is_of_horizon
and the record's LSN to know if it should modify "records".
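The comparison described above can be sketched as follows. This is a
minimal illustration only, not the code of this file: sketch_state,
sketch_apply_undo_row() and the plain integer LSN type are hypothetical
stand-ins, and the use of ">=" rather than ">" for the comparison is an
assumption.

```c
#include <stdint.h>

// Hypothetical simplified LSN: a larger value means later in the log.
typedef uint64_t sketch_lsn;

// Hypothetical stand-in for the part of the table state discussed here.
struct sketch_state {
  uint64_t   records;        // row count kept in the index file's header
  sketch_lsn is_of_horizon;  // log horizon when the state last went to disk
};

// REDO phase: account for one UNDO_ROW_INSERT (is_insert != 0) or
// UNDO_ROW_DELETE (is_insert == 0) log record located at rec_lsn.
// Returns 1 if "records" was modified, 0 if the record was skipped.
static int sketch_apply_undo_row(struct sketch_state *state,
                                 sketch_lsn rec_lsn, int is_insert)
{
  // The state on disk is newer than this record, so it already
  // includes the record's effect: applying it again would count twice.
  if (rec_lsn < state->is_of_horizon)
    return 0;
  if (is_insert)
    state->records++;
  else
    state->records--;
  return 1;
}
```

Skipping every record older than is_of_horizon is what makes replaying
the same stretch of log any number of times give the same "records".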
Other operations like ALTER TABLE DISABLE KEYS update the state but
don't write log records, thus the REDO phase cannot repeat their
effect on the state in case of crash. But we make them sync the state
as soon as they have finished. This reduces the window for a problem.
It looks like only one thread at a time updates the state in memory or
on disk. We assume that the upper level (normally MySQL) has protection
against issuing HA_EXTRA_(FORCE_REOPEN|PREPARE_FOR_RENAME) so that these
are not issued while there are any running transactions on the given table.
If this is not done, we may write a corrupted state to disk.
With checkpoints
================
Checkpoint module needs to read the state in memory and write it to
disk. This may happen while some other thread is modifying the state
in memory or on disk. Checkpoint may thus be reading data which is
changing; it needs a mutex so that what it reads is not corrupted, and
concurrent modifiers of the state need that mutex too, for the same reason.
"records" is modified for every row write/update/delete, we don't want
to add a mutex lock/unlock there. So we re-use the mutex lock/unlock
which is already present in these moments, namely the log's mutex which is
taken when UNDO_ROW_INSERT|UPDATE|DELETE is written: we update "records" in
under-log-mutex hooks when writing these records (thus "records" is
not updated at the end of maria_write/update/delete() anymore).
Thus Checkpoint can take the log's lock, read "records" from
memory, write it to disk and release the log's lock.
We however want to avoid having the disk write under the log's
lock. So it has to be under another mutex; the natural choice is
intern_lock (as Checkpoint needs it anyway to read MARIA_SHARE::kfile,
and as maria_close() takes it too). All state writes to disk are
changed to be protected with intern_lock.
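The locking scheme of this section can be sketched as follows. This is
an illustration under stated assumptions, not the real implementation:
all sketch_* names are hypothetical, the "disk" is two plain fields, and
reading the horizon under the log's lock together with "records" is an
assumption of the sketch.

```c
#include <pthread.h>
#include <stdint.h>

// Hypothetical stand-in for the shared table object.
struct sketch_share {
  pthread_mutex_t intern_lock;  // protects all state writes to disk
  uint64_t records;             // in-memory row count
  uint64_t is_of_horizon;       // horizon of the last state write
  uint64_t records_on_disk;     // pretend index header on disk
  uint64_t horizon_on_disk;
};

// Hypothetical log: one mutex and a horizon that grows with each record.
static pthread_mutex_t sketch_log_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t sketch_log_horizon = 0;

// Row insert: the log's lock is taken to write UNDO_ROW_INSERT anyway,
// so "records" is updated in the same critical section (the "hook").
static void sketch_log_undo_row_insert(struct sketch_share *share)
{
  pthread_mutex_lock(&sketch_log_lock);
  sketch_log_horizon++;  // pretend the log record was appended
  share->records++;      // under-log-mutex update, no extra mutex
  pthread_mutex_unlock(&sketch_log_lock);
}

// Checkpoint: copy "records" under the log's lock, then do the slow
// disk write under intern_lock only.
static void sketch_checkpoint_state(struct sketch_share *share)
{
  uint64_t records_copy, horizon_copy;
  pthread_mutex_lock(&share->intern_lock);
  pthread_mutex_lock(&sketch_log_lock);
  records_copy = share->records;
  horizon_copy = sketch_log_horizon;
  pthread_mutex_unlock(&sketch_log_lock);  // row writers may proceed
  share->is_of_horizon = horizon_copy;
  share->records_on_disk = records_copy;   // stands for the disk write
  share->horizon_on_disk = horizon_copy;
  pthread_mutex_unlock(&share->intern_lock);
}
```

In this sketch the lock order is always intern_lock first, then the
log's lock; the row-write path never takes intern_lock, so it cannot
deadlock against Checkpoint.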
So Checkpoint takes intern_lock, log's lock, reads "records" from
memory, releases log's lock, updates is_of_horizon and writes "records" to
- WL#3072 Maria Recovery: Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update it's is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name
2007-09-07 15:02:30 +02:00
disk, releases intern_lock.
In practice, not only "records" needs to be written but the full
state. So, Checkpoint reads the full state from memory. Some other
thread may at this moment be modifying in memory some pieces of the
state which are not protected by the log's lock (see ma_extra.c
HA_EXTRA_NO_KEYS), and Checkpoint would be reading a corrupted state
from memory; to guard against that we extend the intern_lock-zone to
changes done to the state in memory by HA_EXTRA_NO_KEYS et al, and
WL#3071 Maria checkpoint Finally this is the real checkpoint code. It however exhibits unstabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
2007-09-12 11:27:34 +02:00
also any change made in memory to create_rename_lsn/state_is_of_horizon.
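The single-table sequence above can be sketched as follows. This is only an illustration under stated assumptions: `struct state`, `struct share`, `log_lock`, `log_horizon`, `share_init()`, `log_row_insert()` and `checkpoint_one_table()` are hypothetical stand-ins, not the real Maria types or functions, and the "disk" is just a second struct.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef unsigned long long lsn_t;

// Hypothetical, much reduced stand-in for MARIA_STATE_INFO.
struct state { unsigned long long records; lsn_t is_of_horizon; };

// Hypothetical stand-in for MARIA_SHARE; "disk" plays the index file header.
struct share {
  pthread_mutex_t intern_lock;  // serializes all state writes to disk
  struct state mem;             // state as it is in memory
  struct state disk;            // state as it was last written
};

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static lsn_t log_horizon = 100;  // arbitrary starting end-of-log position

static void share_init(struct share *s)
{
  pthread_mutex_init(&s->intern_lock, NULL);
  s->mem.records = s->disk.records = 0;
  s->mem.is_of_horizon = s->disk.is_of_horizon = 0;
}

// A row insert: "records" changes only under the log's lock.
static void log_row_insert(struct share *s)
{
  pthread_mutex_lock(&log_lock);
  s->mem.records++;
  log_horizon++;
  pthread_mutex_unlock(&log_lock);
}

// Checkpoint of one table: intern_lock, log's lock, read, release log's
// lock, update is_of_horizon, write, release intern_lock.
static void checkpoint_one_table(struct share *s)
{
  struct state copy;
  lsn_t horizon;

  pthread_mutex_lock(&s->intern_lock);
  pthread_mutex_lock(&log_lock);
  copy = s->mem;           // read the state from memory
  horizon = log_horizon;   // assumption: horizon sampled in the same section
  pthread_mutex_unlock(&log_lock);

  assert(horizon >= s->disk.is_of_horizon);
  copy.is_of_horizon = horizon;
  s->mem.is_of_horizon = horizon;
  s->disk = copy;          // the slow write happens without the log's lock
  pthread_mutex_unlock(&s->intern_lock);
}
```

Row writers never take intern_lock in this sketch, so the only cost they pay for Checkpoint is the short copy made under the log's lock.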
Last, in Checkpoint we don't want to do
log lock; read state from memory; release log lock;
for each table, as it may hold the log's lock for too long in total.
So, we instead do
log lock; read N states from memory; release log lock;
Thus, the sequence above happens outside of any intern_lock.
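The batched read, together with the staleness check that the following sentences describe, can be sketched as below. Again this is a hedged illustration, not the real code: `snapshot_states()`, `flush_snapshot()`, `flush_state_now()` and the reduced `struct share` are hypothetical names, and `flush_state_now()` merely imitates a thread such as HA_EXTRA_NO_KEYS that writes the state under intern_lock without the log's lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef unsigned long long lsn_t;

// Hypothetical reduced versions of the state and the share.
struct state { unsigned long long records; lsn_t is_of_horizon; };
struct share {
  pthread_mutex_t intern_lock;
  struct state mem;   // state in memory
  struct state disk;  // stands in for the index file header
};

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static lsn_t log_horizon = 100;  // arbitrary end-of-log position

static void share_init(struct share *s, unsigned long long records)
{
  pthread_mutex_init(&s->intern_lock, NULL);
  s->mem.records = records;
  s->mem.is_of_horizon = 0;
  s->disk.records = 0;
  s->disk.is_of_horizon = 0;
}

// Another thread writing the state under intern_lock only.
static void flush_state_now(struct share *s, lsn_t horizon)
{
  pthread_mutex_lock(&s->intern_lock);
  s->mem.is_of_horizon = horizon;
  s->disk = s->mem;
  pthread_mutex_unlock(&s->intern_lock);
}

// One acquisition of the log's lock for all n tables.
static lsn_t snapshot_states(struct share **tables, struct state *copies,
                             size_t n)
{
  lsn_t horizon;
  size_t i;

  assert(tables != NULL && copies != NULL);
  pthread_mutex_lock(&log_lock);
  for (i = 0; i < n; i++)
    copies[i] = tables[i]->mem;
  horizon = log_horizon;
  pthread_mutex_unlock(&log_lock);
  return horizon;
}

// Later, per table, under intern_lock: returns 1 if the copy was written,
// 0 if it was dropped because a more recent state was flushed meanwhile.
static int flush_snapshot(struct share *s, const struct state *copy,
                          lsn_t horizon)
{
  int flushed = 0;

  pthread_mutex_lock(&s->intern_lock);
  if (s->mem.is_of_horizon <= copy->is_of_horizon)
  {
    s->disk = *copy;
    s->disk.is_of_horizon = horizon;
    s->mem.is_of_horizon = horizon;
    flushed = 1;
  }
  pthread_mutex_unlock(&s->intern_lock);
  return flushed;
}
```

The comparison is the only thing that protects against writing an obsolete copy, which is why every state write in the sketch also advances is_of_horizon.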
But this re-introduces the problem that some other thread may be changing the
state in memory and on disk under intern_lock, without log's lock, like
HA_EXTRA_NO_KEYS, while we read the N states. However, when Checkpoint later
comes to handling the table under intern_lock, which is serialized with
HA_EXTRA_NO_KEYS, it can see that is_of_horizon is higher than when the state
was read from memory under log's lock, and thus can decide to not flush the
obsolete state it has, knowing that the other thread flushed a more recent
state already. If on the other hand is_of_horizon is not higher, the read
state is current and can be flushed. So we have a per-table sequence:
lock intern_lock; test if is_of_horizon is higher than when we read the state
under log's lock; if it is not, flush the read state to disk.
*/
/* some comments and pseudo-code which we keep for later */
#if 0
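/*
  Illustrative sketch only, not the Maria implementation: the per-table
  sequence described in the comment just before this #if 0 block (lock
  intern_lock, test whether is_of_horizon moved past the value seen when the
  state was read, flush the read state only if it did not).
  Assumptions: every name prefixed with sketch_/SKETCH_ is hypothetical; a
  plain pthread mutex stands in for intern_lock, a uint64_t stands in for a
  log horizon, and "flushing" is modelled as copying a record count.
*/
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

typedef struct sketch_table
{
  pthread_mutex_t intern_lock; /* stands in for the share's intern_lock */
  uint64_t is_of_horizon;      /* horizon of the newest state flushed so far */
  uint64_t records_on_disk;    /* stands in for the state stored on disk */
} SKETCH_TABLE;

typedef struct sketch_state_copy
{
  uint64_t read_horizon; /* is_of_horizon seen when the state was read */
  uint64_t records;      /* the state that was read under the log's lock */
} SKETCH_STATE_COPY;

/* Returns 1 if the copy was flushed, 0 if it was obsolete and was dropped */
static int sketch_flush_state_if_current(SKETCH_TABLE *table,
                                         const SKETCH_STATE_COPY *copy)
{
  int flushed= 0;
  assert(table != NULL && copy != NULL);
  pthread_mutex_lock(&table->intern_lock);
  if (table->is_of_horizon <= copy->read_horizon)
  {
    /* nobody flushed a more recent state: the copy is current */
    table->records_on_disk= copy->records;
    flushed= 1;
  }
  /* otherwise another thread already flushed a more recent state */
  pthread_mutex_unlock(&table->intern_lock);
  return flushed;
}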
/*
MikaelR suggests: support checkpoints during REDO phase too: do checkpoint
after a certain amount of log records have been executed. This helps
against repeated crashes. Those checkpoints could not be user-requested
(as engine is not communicating during the REDO phase), so they would be
automatic: this changes the original assumption that we don't write to the
log while in the REDO phase, but why not. How often should we checkpoint?
*/
/*
We want to have two steps:
engine->recover_with_max_memory();
next_engine->recover_with_max_memory();
engine->init_with_normal_memory();
next_engine->init_with_normal_memory();
So: in recover_with_max_memory() allocate a giant page cache, do REDO
phase, then all page cache is flushed and emptied and freed (only retain
small structures like TM): take full checkpoint, which is useful if
next engine crashes in its recovery the next second.
Destroy all shares (maria_close()), then at init_with_normal_memory() we
do this:
*/
/**** UNDO PHASE *****/
/*
Launch one or more threads to do the background rollback. Don't wait for
them to complete their rollback (background rollback; for debugging, we
can have an option which waits). Set a counter (total_of_rollback_threads)
to the number of threads to launch.
Note that InnoDB's rollback-in-background works as long as InnoDB is the
last engine to recover, otherwise MySQL will refuse new connections until
the last engine has recovered so it's not "background" from the user's
point of view. InnoDB is near top of sys_table_types so all others
(e.g. BDB) recover after it... So it's really "online rollback" only if
InnoDB is the only engine.
*/
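/*
  Illustrative sketch only of the counter idea in the comment above: set
  total_of_rollback_threads to the number of threads to launch, let each
  thread decrement it when done, and wait for zero only when a debugging
  option asks for it.
  Assumptions: all sketch_* names are hypothetical, POSIX threads are used,
  and the rollback work itself is left out.
*/
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t sketch_rollback_lock= PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t sketch_rollback_done= PTHREAD_COND_INITIALIZER;
static unsigned int total_of_rollback_threads= 0;

/* Decrements the counter and wakes up a possible waiter when it reaches 0 */
static void sketch_one_rollback_thread_less(void)
{
  pthread_mutex_lock(&sketch_rollback_lock);
  if (--total_of_rollback_threads == 0)
    pthread_cond_broadcast(&sketch_rollback_done);
  pthread_mutex_unlock(&sketch_rollback_lock);
}

static void *sketch_rollback_thread(void *arg)
{
  (void) arg; /* a real thread would roll back its transactions here */
  sketch_one_rollback_thread_less();
  return NULL;
}

/* Returns the number of threads which were actually launched */
static unsigned int sketch_launch_background_rollback(unsigned int threads,
                                                      int wait_for_completion)
{
  unsigned int i, launched= 0;
  pthread_mutex_lock(&sketch_rollback_lock);
  total_of_rollback_threads= threads;
  pthread_mutex_unlock(&sketch_rollback_lock);
  for (i= 0; i < threads; i++)
  {
    pthread_t thread;
    if (pthread_create(&thread, NULL, sketch_rollback_thread, NULL) == 0)
    {
      pthread_detach(thread); /* background: nobody joins it */
      launched++;
    }
    else
      sketch_one_rollback_thread_less(); /* it will never decrement itself */
  }
  if (wait_for_completion) /* the debugging option */
  {
    pthread_mutex_lock(&sketch_rollback_lock);
    while (total_of_rollback_threads > 0)
      pthread_cond_wait(&sketch_rollback_done, &sketch_rollback_lock);
    pthread_mutex_unlock(&sketch_rollback_lock);
  }
  return launched;
}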
/* wake up delete/update handler */
/* tell the TM that it can now accept new transactions */
/*
mark that checkpoint requests are now allowed.
*/
#endif