mirror of
https://github.com/MariaDB/server.git
synced 2025-01-23 15:24:16 +01:00
15 commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
unknown
|
4e0964cb04 |
Fixed repair_by_sort to work with BLOCK_RECORD
Fixed bugs in undo logging Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) Reserved place for reference-transid on key pages (for packing of transids) ALTER TABLE and INSERT ... SELECT now use fast creation of index Known bugs: ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix) ma_test_recovery.expected is not totally correct, because of the above bug mysqld sometimes fails to restart; Fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate include/maria.h: Prototype changes Added current_filepos to st_maria_sort_info mysql-test/r/maria.result: Updated results that change as alter table and insert ... select now use fast creation of index mysys/mf_iocache.c: Reset variable to guard against double invocation storage/maria/ma_bitmap.c: Added _ma_bitmap_reset_cache() (needed for repair) storage/maria/ma_blockrec.c: Simplify code More initial allocations Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read) storage/maria/ma_blockrec.h: Moved TRANSID_SIZE to maria_def.h Added prototype for new functions storage/maria/ma_check.c: Simplify code Fixed repair_by_sort to work with BLOCK_RECORD - When using BLOCK_RECORD or UNPACK create new Maria handle - Use common initializer function - Align code with maria_repair() Made some changes to maria_repair_parallel() to use common initializer function Removed ASK_MONTY section by fixing noted problem storage/maria/ma_close.c: Moved check for readonly to _ma_state_info_write() storage/maria/ma_key_recover.c: Use different log entries if key root changes or not. 
This fixed some bugs when tree grows storage/maria/ma_key_recover.h: Added keynr to st_msg_to_write_hook_for_undo_key storage/maria/ma_loghandler.c: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_loghandler.h: Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT storage/maria/ma_open.c: Added TRANSID to all key pages (for future compressing of trans id's) For compressed records, alloc a bit bigger buffer to avoid valgrind warnings If table is opened readonly, don't update state storage/maria/ma_packrec.c: Allocate bigger array for bit unpacking to avoid valgrind errors storage/maria/ma_recovery.c: Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT storage/maria/ma_sort.c: More logging storage/maria/ma_test_all.sh: More tests storage/maria/ma_test_recovery.expected: Update results Note that this is not complete because of a bug in recovery storage/maria/ma_test_recovery: Removed recreation of index (not needed when we have redo for index pages) storage/maria/maria_chk.c: When using flag --read-only, don't update status for files When using --unpack, don't use REPAIR_BY_SORT if other repair option is given Enable repair_by_sort for BLOCK records Removed not needed newline at start of --describe storage/maria/maria_def.h: Support for TRANSID_SIZE to key pages storage/maria/maria_read_log.c: renamed --only-display to --display-only |
||
unknown
|
6b3743f0aa |
Fixes for redo/undo logging of key pages
New extendable format for maria_log_control file Fixed some compiler warnings include/maria.h: Added maria_disable_logging() and maria_enable_logging() mysql-test/include/maria_verify_recovery.inc: Updated tests now when key redo/undo works mysql-test/r/maria-recovery.result: Updated tests now when key redo/undo works storage/maria/ma_blockrec.c: Use unified CLR code Added rec_lsn for full pages Moved clr write hook to ma_key_recover.c Changed REDO code to keep pages pinned until undo Mark page_link's as changed storage/maria/ma_blockrec.h: Moved write_hook_for_clr_end() to ma_key_recover.c storage/maria/ma_check.c: Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE Fixed wrong warning when checking files after maria_pack When unpacking files, we have to use new keypos_to_recpos method When doing repair, we can disregard index key file pages in page cache storage/maria/ma_commit.c: Added simple enable/disable logging functions (Needed for recovery) storage/maria/ma_control_file.c: Make maria control file extendable without having to make it incompatible for older versions storage/maria/ma_control_file.h: New error messages Added CONTROL_FILE_VERSION storage/maria/ma_delete.c: Added redo/undo for key pages change_length -> changed_length to make things similar More comments & more DBUG storage/maria/ma_key_recover.c: Unified CLR method Moved here write_hook_for_clr_end() and common keypage log functions Changed REDO to keep pages pinned until undo Changed UNDO code to change key_root under log mutex storage/maria/ma_key_recover.h: New structures and functions storage/maria/ma_loghandler.c: Include needed files storage/maria/ma_open.c: Change maria_open() to use pread() instead of read() storage/maria/ma_page.c: Fixed bug in key_del handling Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined storage/maria/ma_pagecache.c: Indentation and spelling fixes More DBUG Added helper function: pagecache_block_link_to_buffer() storage/maria/ma_pagecache.h: Added 
pagecache_block_link_to_buffer() storage/maria/ma_recovery.c: Fixed state.changed Fixed that REDO keeps pages pinned until UNDO Some bug fixes from previous commit Fixes for UNDO/REDO of key pages storage/maria/ma_search.c: Fixed packing and storing of keys to provide more information to caller so that we can do efficient REDO logging of the changes. storage/maria/ma_test1.c: Fixed bug with not initialized variable storage/maria/ma_test2.c: Removed not used code storage/maria/ma_test_all.res: Updated results storage/maria/ma_test_all.sh: Changed one test to test more Removed timing tests as not relevant here storage/maria/ma_test_recovery.expected: Updated test result after redo/undo if key pages works storage/maria/ma_test_recovery: Updated test after redo/undo if key pages works storage/maria/ma_write.c: Moved some general log functions to ma_key_recover.c Fixed some bugs in undo Moved ma_log_split() to _ma_split_page() Small changes in some function arguments to be able to do redo logging storage/maria/maria_chk.c: disable logging while doing repair table storage/maria/maria_def.h: New function prototypes Move some structs and functions to ma_key_recover.c storage/maria/unittest/ma_control_file-t.c: Updated with patch from Sanja NOTE: This is not complete and needs to be updated to new control file format storage/maria/unittest/ma_test_loghandler-t.c: Fixed compiler warning |
||
unknown
|
301ee8d9a3 |
Merge bk-internal.mysql.com:/home/bk/mysql-maria
into mysql.com:/home/my/mysql-maria include/my_sys.h: Auto merged mysql-test/r/maria.result: Auto merged mysql-test/t/maria.test: Auto merged sql/handler.h: Auto merged sql/mysqld.cc: Auto merged storage/maria/ha_maria.cc: Auto merged storage/maria/ma_bitmap.c: Auto merged storage/maria/ma_blockrec.c: Auto merged storage/maria/ma_loghandler.c: Auto merged storage/maria/ma_pagecache.c: Auto merged storage/maria/ma_test1.c: Auto merged storage/maria/ma_test_recovery.expected: Auto merged storage/maria/ma_test_recovery: Auto merged sql/mysql_priv.h: manual merge storage/maria/ma_recovery.c: manual merge storage/maria/ma_test2.c: manual merge |
||
unknown
|
13d53bf657 |
Merge some changes from sql directory in 5.1 tree
Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added redo_free_head_or_tail() & redo_insert_row_blobs() Added uuid to control file maria_checks now verifies that not used part of bitmap is 0 REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL Fixes problem when trying to read block outside of file during REDO include/my_global.h: STACK_DIRECTION is already set by configure mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: Test shrinking of VARCHAR mysys/my_realloc.c: Fixed indentation mysys/safemalloc.c: Fixed indentation sql/filesort.cc: Removed some casts sql/mysqld.cc: Added missing setting of myisam_stats_method_str sql/uniques.cc: Removed some casts storage/maria/ma_bitmap.c: Added printing of bitmap (for debugging) Renamed _ma_print_bitmap() -> _ma_print_bitmap_changes() Added _ma_set_full_page_bits() Fixed bug in ma_bitmap_find_new_place() (affecting updates) when using big files storage/maria/ma_blockrec.c: Changed format for REDO_INSERT_ROWS_BLOBS Fixed several bugs in handling of big blobs Added code to fix some cases where redo when using blobs didn't produce identical .MAD files as normal usage REDO_FREE_ROW_BLOCKS no longer changes pages; We only mark things free in bitmap Remove TAIL and filler extents from REDO_FREE_BLOCKS log entry. (Fixed some asserts) REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Delete tails in update. (Fixed bug when doing update that shrinks blob/varchar length) Fixed bug when doing insert in block outside of file size. Added redo_free_head_or_tail() & redo_insert_row_blobs() Added pagecache_unlock_by_link() when read fails. 
Much more comments, DBUG and ASSERT entries storage/maria/ma_blockrec.h: Prototypes of new functions Define of SUB_RANGE_SIZE & BLOCK_FILLER_SIZE storage/maria/ma_check.c: Verify that not used part of bitmap is 0 storage/maria/ma_control_file.c: Added uuid to control file storage/maria/ma_loghandler.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_loghandler.h: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_pagecache.c: If we write full block, remove error flag for block. (Fixes problem when trying to read block outside of file) storage/maria/ma_recovery.c: REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS Added REDO_FREE_HEAD_OR_TAIL storage/maria/ma_test1.c: Allow option after 'b' to be compatible with ma_test2 (This is just to simplify test scripts like ma_test_recovery) storage/maria/ma_test2.c: Default size of blob is now 1000 instead of 1 storage/maria/ma_test_all.sh: Added test for bigger blobs storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Added test for bigger blobs |
||
unknown
|
0f1feefa03 |
WL#3071 Maria checkpoint
Ability for flush_pagecache_blocks() to flush only certain pages of a file, as instructed by an option "filter" pointer-to-function argument; Checkpoint and background dirty page flushing use that to flush only pages which have been dirty for long enough and bitmap pages. Fix for a bug in flush_cached_blocks() (no idea if it could produce a bug in real life, but theoretically it is). Testing checkpoint in ma_test_recovery via ma_test1 and ma_test2. Background checkpoint & dirty pages flush thread is still disabled by default in ha_maria. mysql-test/r/maria.result: result update storage/maria/ha_maria.cc: blank after function comment storage/maria/ma_checkpoint.c: Using an enum instead of 0/1/2 (applying Sanja's review comments). The comment about "this is an horizon" can be removed as Sanja created translog_next_LSN() which parse_checkpoint_record() uses. Variables in ma_checkpoint_background() cannot be declared in the for() as their value must not be reset at each iteration! storage/maria/ma_pagecache.c: adding to flush_pagecache_blocks() optional arguments 'filter' (pointer to function) and 'filter_arg'; if filter!=NULL this function will be called for each block of the file and will reply if this block and following ones should be flushed or not (3 possible replies). Fixing a bug when flush_cached_blocks() skips a pinned page: it has to unset PCBLOCK_IN_FLUSH set by flush_pagecache_blocks_int(). storage/maria/ma_pagecache.h: flush_pagecache_blocks() is changed to take "filter" and "filter_arg" arguments. "filter", if it is not NULL, may return one value among enum pagecache_flush_filter_result. storage/maria/ma_recovery.c: open_count=0 when closing tables at the end of recovery. storage/maria/ma_test1.c: Optional checkpoints (-H#) at various stages (stages similar to --testflag), for testing of checkpoints. storage/maria/ma_test2.c: Optional checkpoints (-H#) at various stages (stages similar to -t), for testing of checkpoints. 
storage/maria/ma_test_recovery.expected: Result update: the results of the additional test run with -H# (checkpoints) are added here. They are exactly identical to without checkpoints except that the index's Root (printed by maria_chk) is more correct when using checkpoints. This is because checkpoint flushed the state, so it happens to be correct, while no-checkpoint does not flush the state, and recovery does not recover indexes so Root is never fixed. When we recover indices, this will go away. storage/maria/ma_test_recovery: We duplicate the loop of tests to add an additional run with checkpoints at various stages, to see if maria_read_log uses them fine. |
||
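The filter mechanism described in the WL#3071 entry above (a callback consulted for each block, with three possible replies) can be sketched roughly as follows. This is an illustration only, not the code from ma_pagecache.c: the enum values, `struct page`, `flush_pages()` and the example filter are hypothetical names, and using `rec_lsn` as the measure of "dirty for long enough" is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names: the three replies a flush filter may give. */
enum flush_filter_reply
{
  FILTER_FLUSH_THIS,          /* flush this page, then look at the next one */
  FILTER_SKIP_THIS,           /* skip this page, then look at the next one */
  FILTER_SKIP_THIS_AND_REST   /* skip this page and all following ones */
};

/* Simplified stand-in for a page cache block (assumption). */
struct page
{
  unsigned long rec_lsn;  /* LSN at which the page first became dirty */
  int is_bitmap;          /* non-zero for bitmap pages */
  int dirty;              /* non-zero if the page needs writing */
};

typedef enum flush_filter_reply (*flush_filter)(const struct page *page,
                                                void *filter_arg);

/* Example filter: flush bitmap pages and pages dirtied at or before an LSN. */
static enum flush_filter_reply
filter_old_or_bitmap(const struct page *page, void *filter_arg)
{
  unsigned long max_lsn= *(unsigned long *) filter_arg;
  if (page->is_bitmap || page->rec_lsn <= max_lsn)
    return FILTER_FLUSH_THIS;
  return FILTER_SKIP_THIS;
}

/* Flush dirty pages; a NULL filter means "flush everything".
   Returns the number of pages flushed. */
static size_t flush_pages(struct page *pages, size_t count,
                          flush_filter filter, void *filter_arg)
{
  size_t flushed= 0, i;
  assert(pages != NULL || count == 0);
  for (i= 0; i < count; i++)
  {
    if (!pages[i].dirty)
      continue;
    if (filter != NULL)
    {
      enum flush_filter_reply reply= filter(&pages[i], filter_arg);
      if (reply == FILTER_SKIP_THIS)
        continue;
      if (reply == FILTER_SKIP_THIS_AND_REST)
        break;
    }
    pages[i].dirty= 0;  /* stand-in for the real write to disk */
    flushed++;
  }
  return flushed;
}
```

Passing the policy as a callback plus an opaque argument lets checkpoint and background flushing share one flush loop while each decides which pages qualify.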
unknown
|
df30832d11 |
Merge bk-internal.mysql.com:/home/bk/mysql-maria
into mysql.com:/home/my/mysql-maria client/mysqladmin.cc: Auto merged include/maria.h: Auto merged include/my_sys.h: Auto merged include/mysql_com.h: Auto merged mysql-test/r/maria.result: Auto merged server-tools/instance-manager/listener.cc: Auto merged sql/handler.h: Auto merged sql/item_func.cc: Auto merged sql/item_func.h: Auto merged sql/item_strfunc.cc: Auto merged sql/mysql_priv.h: Auto merged sql/mysqld.cc: Auto merged sql/sql_class.cc: Auto merged sql/sql_class.h: Auto merged sql/sql_show.cc: Auto merged sql/sql_table.cc: Auto merged sql/table.cc: Auto merged sql/table.h: Auto merged storage/maria/ma_bitmap.c: Auto merged storage/maria/ma_blockrec.c: Auto merged storage/maria/ma_blockrec.h: Auto merged storage/maria/ma_check.c: Auto merged storage/maria/ma_create.c: Auto merged storage/maria/ma_delete.c: Auto merged storage/maria/ma_loghandler.h: Auto merged storage/maria/ma_open.c: Auto merged storage/maria/ma_search.c: Auto merged storage/maria/ma_sort.c: Auto merged storage/maria/ma_test2.c: Auto merged storage/maria/ma_test_recovery.expected: Auto merged storage/maria/ma_write.c: Auto merged storage/maria/maria_chk.c: Auto merged storage/maria/maria_pack.c: Auto merged include/my_base.h: Trivial manual merge libmysql/Makefile.shared: Trivial manual merge sql/sql_yacc.yy: Manual merge storage/maria/ha_maria.cc: Trivial manual merge storage/maria/ma_page.c: Trivial manual merge storage/maria/maria_def.h: Trivial manual merge |
||
unknown
|
496741d576 |
Moved randomize and my_rnd under mysys
Added my_uuid Added pre-support for PAGE_CHECKSUM Added syntax for CREATE ... PAGE_CHECKSUM=# TABLE_CHECKSUM=# Reserved place for page checksums on index, bitmap and block pages Added index number to header of index pages Added linked list for free directory entries (speeds up inserts with BLOCK format) Calculate checksums in original column order (fixes bug with checksum on rows with BLOCK format) Cleaned up all index handling to use 'info->s->keypage_header' (variable size) as the header for index pages (before this was '2') Added 0xffffffff to end of index and block data bases and 0xfffffffe at end of bitmap pages when page checksums are not enabled Added _ma_get_page_used() and _ma_get_used_and_node() to simplify index page header handling rec_per_key_part is now in double precision Reserved place in index file for my_guid and nulls_per_key_part Give error HA_ERR_NEW_FILE if trying to open a Maria file with new, not yet supported extensions Lots of renames to increase readability: randomize() -> my_rnd_init() st_maria_info -> st_maria_handler st_maria_info -> MARIA_HA st_maria_isaminfo -> st_maria_info rand_struct -> my_rand_struct rec_per_key_rows -> records_at_analyze client/mysqladmin.cc: rand_struct -> my_rand_struct include/maria.h: st_maria_info -> MARIA_HA st_maria_isaminfo -> st_maria_info Changed analyze statistics to be of double precision Changed offset to field to be 32bits instead of 64 (safe as a record without blobs can't be that big) include/my_base.h: Added HA_OPTION_PAGE_CHECKSUM & HA_CREATE_PAGE_CHECKSUM Fixed comments Added HA_ERR_NEW_FILE include/my_sys.h: Added prototypes and structures for my_uuid() and my_rnd() include/myisamchk.h: Changed some buffers to size_t Added possibility to have key statistics with double precision include/mysql_com.h: Move rand functions to mysys libmysql/Makefile.shared: Added my_rnd mysql-test/r/maria.result: Updated results mysql-test/t/maria.test: More tests for checksum mysys/Makefile.am: Added my_rnd.c 
and my_uuid.c server-tools/instance-manager/listener.cc: Fixed include order (my_global.h should always be first) server-tools/instance-manager/mysql_connection.cc: Fixed include order (my_global.h should always be first) Use my_rnd_init() server-tools/instance-manager/mysql_connection.h: rand_struct -> my_rand_struct sql/handler.h: Added flag for page checksums sql/item_func.cc: Use new my_rnd() interface sql/item_func.h: Use new my_rnd() interface sql/item_strfunc.cc: Use new my_rnd() interface sql/lex.h: Added PAGE_CHECKSUM and TABLE_CHECKSUM sql/mysql_priv.h: Use new my_rnd() interface sql/mysqld.cc: Use new my_rnd() interface sql/password.c: Move my_rnd() to mysys Use new my_rnd() interface sql/sql_class.cc: Use new my_rnd() interface sql/sql_class.h: Use new my_rnd() interface sql/sql_crypt.cc: Use new my_rnd() interface sql/sql_crypt.h: Use new my_rnd() interface sql/sql_show.cc: Simpler handling of ha_choice_values Added PAGE_CHECKSUM sql/sql_table.cc: Enable correct checksum handling (for now) if not running in compatible mode sql/sql_yacc.yy: Added table option PAGE_CHECKSUM Added future compatible table option TABLE_CHECKSUM (alias for CHECKSUM) Added 'choice' target to simplify code sql/table.cc: Store flag for PAGE_CHECKSUM sql/table.h: Added support for PAGE_CHECKSUM storage/maria/ha_maria.cc: Remove protection for incompatible frm and MAI (Slow, not needed test) Rec_per_key is now in double Remember row type for table Give warning if one Maria uses another row type than requested Removed some old ASK_MONTY entries (added comments instead) Added handling of PAGE_CHECKSUM flags storage/maria/ma_bitmap.c: Added page checksums to bitmap pages Added special bitmap marker for bitmap pages (Used to find bugs when running without page checksums) storage/maria/ma_blockrec.c: Added a free-link list over directory entries. This makes insert of small rows faster as we don't have to scan the whole directory to find a not used entry. 
Moved SANITY_CHECKS to maria_def.h Simplify code by introducing dir_entry_pos() Added support for PAGE_CHECKSUM storage/maria/ma_blockrec.h: Added DIR_FREE_SIZE (linked list of free directory entries) Added PAGE_CHECKSUM Added 'dir_entry_pos()' storage/maria/ma_check.c: Check that index pages have correct index number Calculate rec_per_key with double precision Simplify code by using '_ma_get_used_and_node()' Check free directory list Remove wrong end \n from messages maria_data_on_page() -> _ma_get_page_used() maria_putint() -> _ma_store_page_used() rec_per_key_rows -> records_at_analyze storage/maria/ma_checksum.c: Calculate checksum in original column order storage/maria/ma_create.c: Store original column order in index file Reserve place for nulls_per_key_part (future) Added support for PAGE_CHECKSUM storage/maria/ma_dbug.c: Fixed wrong debug output of key of type 'ulong' storage/maria/ma_delete.c: maria_data_on_page() -> _ma_get_used_and_node() maria_data_on_page() -> _ma_get_page_used() maria_putint() -> _ma_store_page_used() Added page header (index key number) to all index pages Reserved page for checksum on index pages Use keypage_header storage/maria/ma_ft_update.c: maria_putint() -> _ma_store_page_used() Store key number at start of page storage/maria/ma_loghandler.h: st_maria_info -> MARIA_HA storage/maria/ma_open.c: rec_per_key is now in double precision Added 'nulls_per_key_part' Added 'extra_options' (flags for future) Added support for PAGE_CHECKSUM Give error HA_ERR_NEW_FILE when using unsupported maria extensions Added comments Add maria_uuid to index file Added functions to store and read column_nr map. 
Changed some functions to return my_bool instead of uint storage/maria/ma_page.c: Added checks that pages have correct key nr Store 0xffffffff in checksum position if page checksums are not enabled Moved key-page-delete link to take into account keypage header storage/maria/ma_preload.c: Remove old MyISAM dependent code When scanning pages, only add pages to page cache for the requested index storage/maria/ma_range.c: maria_data_on_page() -> _ma_get_used_and_node() Use keypage_header storage/maria/ma_rt_index.c: Fixed indentation storage/maria/ma_rt_index.h: Added support for dynamic index page header Reserved place for PAGE_CHECKSUM storage/maria/ma_rt_key.c: Fixed indentation maria_data_on_page() -> _ma_get_page_used() maria_putint() -> maria_store_page_used() storage/maria/ma_rt_mbr.c: Fixed indentation storage/maria/ma_rt_split.c: Fixed indentation maria_data_on_page () -> _ma_get_page_used() storage/maria/ma_rt_test.c: Fixed indentation storage/maria/ma_search.c: Remove support of using -1 as 'last used index' to _ma_check_index() maria_data_on_page() -> _ma_get_page_used() maria_data_on_page() -> _ma_get_used_and_node() Use keypage_header storage/maria/ma_sort.c: Changed some buffers to size_t Changed rec_per_key_part to double storage/maria/ma_static.c: Removed NEAR Added maria_uuid storage/maria/ma_test2.c: Moved testflag == 2 to correct place Remove test of reading with index number -1 (not supported anymore) storage/maria/ma_test_recovery.expected: Updated results storage/maria/ma_test_recovery: Changed tmp table names so that one can run maria_chk on them storage/maria/ma_write.c: Fixed indentation Use keypage_header Store index number on index pages maria_putint() -> _ma_store_page_used() maria_data_on_page() -> ma_get_used_and_node() maria_data_on_page() -> _ma_get_page_used() Added PAGE_CHECKSUM Added Maria handler to some functions Removed some not needed casts storage/maria/maria_chk.c: Added error handling for HA_ERR_NEW_FILE Added information about 
page checksums rec_per_key_part changed to double maria_data_on_page() -> _ma_get_page_used() Use keypage_header storage/maria/maria_def.h: Added IDENTICAL_PAGES_AFTER_RECOVERY and SANITY_CHECKS Changed rec_per_key_part to double Added nulls_per_key_part rec_per_key_rows -> records_at_analyze st_maria_info -> MARIA_HA Reserve place for new statistics variables, uuid, checksums per page etc. Removed NEAR tags Changed some prototypes to use my_bool and size_t storage/maria/maria_pack.c: st_maria_info -> MARIA_HA Fixed indentation storage/myisam/mi_dbug.c: Fix wrong debug output for ULONG mysys/my_rnd.c: New BitKeeper file ``mysys/my_rnd.c'' mysys/my_uuid.c: New BitKeeper file ``mysys/my_uuid.c'' |
||
unknown
|
d0b9387b88 |
WL#3072 - Maria recovery.
* Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1) is achieved in this patch. The table's live checksum (info->s->state.state.checksum) is updated in inwrite_rec_hook's under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE and REDO_DELETE_ALL. The checksum variation caused by the operation is stored in these UNDOs, so that the REDO phase, when it sees such UNDOs, can update the live checksum if it is older (state.is_of_lsn is lower) than the record. It is also used, as a nice add-on with no cost, to do less row checksum computation during the UNDO phase (as we have it in the record already). Doing this work, it became pressing to move in-write hooks (write_hook_for_redo() et al) to ma_blockrec.c. The 'parts' argument of inwrite_rec_hook is unpredictable (it comes mangled at this stage, for example by LSN compression) so it is replaced by a 'void* hook_arg', which is used to pass down information, currently only to write_hook_for_clr_end() (previous undo_lsn and type of undone record). * If from ha_maria, we print to stderr how many seconds (with one fractional digit) the REDO phase took, same for UNDO phase and for final table close. Just to give an indication for debugging and maybe also for Support. storage/maria/ha_maria.cc: question for Monty storage/maria/ma_blockrec.c: * log in-write hooks (write_hook_for_redo() etc) move from ma_loghandler.c to here; this is natural: the hooks are coupled to their callers (functions in ma_blockrec.c). * translog_write_record() now has a new argument "hook_arg"; using it to pass down to write_hook_for_clr_end() the transaction's previous_undo_lsn and the type of the being undone record, and also to pass down to all UNDOs the live checksum variation caused by the operation. * If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE and in CLR_END the checksum variation ("delta") caused by the operation. 
For example if a DELETE caused the table's live checksum to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes, the value 333 (456-123). * Instead of hard-coded "1" as length of the place where we store the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE; use macros clr_type_store and clr_type_korr. * write_block_record() has a new parameter 'old_record_checksum' which is the pre-computed checksum of old_record; that value is used to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END. * In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE the row's checksum is already computed. * _ma_update_block_record2() now expect the new row's checksum into cur_row.checksum (was already true) and the old row's checksum into new_row.checksum (that's new). Its two callers, maria_update() and _ma_apply_undo_row_update(), honour this. * When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick up the checksum delta from the log record. It is then used to update the table's live checksum when writing CLR_END, and saves us a computation of record. storage/maria/ma_blockrec.h: in-write hooks move from ma_loghandler.c storage/maria/ma_check.c: more straightforward size of buffer storage/maria/ma_checkpoint.c: <= is enough storage/maria/ma_commit.c: new prototype of translog_write_record() storage/maria/ma_create.c: new prototype of translog_write_record() storage/maria/ma_delete.c: The row's checksum must be computed before calling(*delete_record)(), not after, because it must be known inside _ma_delete_block_record() (to update the table's live checksum when writing UNDO_ROW_DELETE). If deleting from a transactional table, live checksum was already updated when writing UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: @todo is now done (in ma_loghandler.c) storage/maria/ma_delete_table.c: new prototype of translog_write_record() storage/maria/ma_loghandler.c: * in-write hooks move to ma_blockrec.c. 
* translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. * fix for compiler warning (unused buffer_start when compiling without debug support) * Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE and CLR_END, but only if the table has live checksum, these records are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their length is X if no live checksum and X+4 otherwise). * add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the table's live checksum. Update it also in hooks of UNDO_ROW_INSERT| DELETE and REDO_DELETE_ALL and CLR_END. * Bugfix: when reading a record in translog_read_record(), it happened that "length" became negative, because the function assumed that the record extended beyond the page's end, whereas it may be shorter. storage/maria/ma_loghandler.h: * Instead of hard-coded "1" and "4", use symbols and macros to store/retrieve the type of record which the CLR_END corresponds to, and the checksum variation caused by the operation which logs the record * translog_write_record() gets a new argument 'hook_arg' which is passed down to pre|inwrite_rec_hook. It is more useful than 'parts' for those hooks, because when those hooks are called, 'parts' has possibly been mangled (like with LSN compression) and is so unpredictable. storage/maria/ma_open.c: fix for "empty body in if() statement" (when compiling without safemutex) storage/maria/ma_pagecache.c: <= is enough storage/maria/ma_recovery.c: * print the time that each recovery phase (REDO/UNDO/flush) took; this is enabled only when recovering from ha_maria. It is printed in seconds with a fractional part of one digit (like 123.4 seconds). 
* In the REDO phase, update the table's live checksum by using the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END. Update it too when seeing REDO_DELETE_ALL. * In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does not have live checksum then reading the record's header (as done by the master loop of run_undo_phase()) is enough; otherwise we do a translog_read_record() to have the checksum delta ready for _ma_apply_undo_row_insert(). * When at the end of the REDO phase we notice that there is an unfinished group of REDOs, don't assert in debug binaries, as I verified that it can happen in real life (with kill -9) * removing ' in #error as it confuses gcc3 storage/maria/ma_rename.c: new prototype of translog_write_record() storage/maria/ma_test_recovery.expected: Change in output of ma_test_recovery: now all live checksums of original tables equal those of tables recreated by the REDO phase and those of tables fixed by the UNDO phase. I.e. recovery of the live checksum looks like working (which was after all the only goal of this changeset). I checked by hand that it's not just all live checksums which are now 0 and that's why they match. They are the old values like 3757530372. maria.test has hard-coded checksum values in its result file so checks this too. storage/maria/ma_update.c: * It's useless to put up HA_STATE_CHANGED in 'key_changed', as we put up HA_STATE_CHANGED in info->update anyway. * We need to compute the old and new rows' checksum before calling (*update_record)(), as checksum delta must be known when logging UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that some functions change the 'newrec' record (at least _ma_check_unique() does) so we cannot move the checksum computation too early in the function. storage/maria/ma_write.c: If inserting into a transactional table, live's checksum was already updated when writing UNDO_ROW_INSERT. The multiplication is a trick to save an if(). 
storage/maria/unittest/ma_test_loghandler-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_first_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_max_lsn-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multigroup-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_multithread-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_noflush-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_pagecache-t.c: new prototype of translog_write_record() storage/maria/unittest/ma_test_loghandler_purge-t.c: new prototype of translog_write_record() storage/myisam/sort.c: fix for compiler warnings in pushbuild (write_merge_key* functions didn't have their declaration match MARIA_HA::write_key). |
||
unknown
|
8b5dddbc00 |
WL#3072 Maria recovery
Progress reports on stderr if doing recovery from ha_maria; don't do checkpoints if activity since last checkpoint < 2MB (no change in fact as background thread is disabled for now); recovery trace is only if EXTRA_DEBUG now (better for benchmarks). storage/maria/ma_checkpoint.c: don't do checkpoints if activity (log writes plus page flushes) since last checkpoint was < 2MB. storage/maria/ma_recovery.c: progress reports in recovery (10%, transactions left to rollback etc); that is only if from ha_maria and is displayed on stderr. Recovery trace is now created only if EXTRA_DEBUG. storage/maria/ma_test_recovery.expected: update (--debug gone) storage/maria/ma_test_recovery: don't use --debug, as it can be absent from the binary |
||
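The 2MB rule described for ma_checkpoint.c above could be sketched as follows. The struct, the function name and the reading of "2MB" as 2 * 1024 * 1024 bytes are assumptions made for illustration; the real module may count activity differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: "2MB" from the commit message, taken as 2 * 1024 * 1024. */
#define DEMO_CHECKPOINT_MIN_ACTIVITY ((uint64_t) 2 * 1024 * 1024)

/* Hypothetical counters of activity since the last checkpoint. */
typedef struct {
  uint64_t log_bytes_written;      /* bytes appended to the transaction log */
  uint64_t page_bytes_flushed;     /* bytes of dirty pages written out */
} demo_activity;

/*
  Returns 1 if a checkpoint is worth taking, 0 if activity (log writes
  plus page flushes) since the last checkpoint is below the threshold.
*/
int demo_checkpoint_is_worthwhile(const demo_activity *a)
{
  assert(a != NULL);
  return a->log_bytes_written + a->page_bytes_flushed >=
         DEMO_CHECKPOINT_MIN_ACTIVITY;
}
```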
unknown
|
cec8ac3e07 |
WL#3071 Maria checkpoint
Finally this is the real checkpoint code. It however exhibits instabilities when a checkpoint runs concurrently with data-modifying clients (table corruption, transaction log's assertions) so for now a checkpoint is taken only at startup after recovery and at shutdown, i.e. not in concurrent situations. Later we will let it run periodically, as well as flush dirty pages periodically (almost all needed code is there already, only pagecache code is written but not committed). WL#3072 Maria recovery * replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via ma_test2 which has INSERTs failing with duplicate keys. * replaying of REDO_RENAME_TABLE Now, off to test Recovery in ha_maria :) BitKeeper/deleted/.del-ma_least_recently_dirtied.c: Delete: storage/maria/ma_least_recently_dirtied.c BitKeeper/deleted/.del-ma_least_recently_dirtied.h: Delete: storage/maria/ma_least_recently_dirtied.h storage/maria/Makefile.am: compile Checkpoint module storage/maria/ha_maria.cc: When ha_maria starts, do a recovery from last checkpoint. Take a checkpoint when that recovery has ended and when ha_maria shuts down cleanly. storage/maria/ma_blockrec.c: * even if my_sync() fails we have to my_close() (otherwise we leak a descriptor) * UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT, as promised in the old comment; it gives us skipping during the UNDO phase. storage/maria/ma_check.c: All REDOs before create_rename_lsn are ignored by Recovery. So create_rename_lsn must be set only after all data/index has been flushed and forced to disk. We thus move write_log_record_for_repair() to after _ma_flush_tables_files_after_repair(). storage/maria/ma_checkpoint.c: Checkpoint module. storage/maria/ma_checkpoint.h: optional argument if caller wants a thread to periodically take checkpoints and flush dirty pages. storage/maria/ma_create.c: * no need to init some vars as the initial bzero(share) takes care of this. 
* update to new function's name * even if we fail in my_sync() we have to my_close() storage/maria/ma_extra.c: Checkpoint reads share->last_version under intern_lock, so we make maria_extra() update it under intern_lock. THR_LOCK_maria still needed because of _ma_test_if_reopen(). storage/maria/ma_init.c: destroy checkpoint module when Maria shuts down. storage/maria/ma_loghandler.c: * UNDO_ROW_PURGE gone (see ma_blockrec.c) * we need to remember the LSN of the LOGREC_FILE_ID for a share, because this LSN is needed into the checkpoint record (Recovery wants to know the validity domain of an id->name mapping) * translog_get_horizon_no_lock() needed for Checkpoint * comment about failing assertion (Sanja knows) * translog_init_reader_data() thought that translog_read_record_header_scan() returns 0 in case of error, but 0 just means "0-length header". * translog_assign_id_to_share() now needs the MARIA_HA because LOGREC_FILE_ID uses a log-write hook. * Verify that (de)assignment of share->id happens only under intern_lock, as Checkpoint reads this id with intern_lock. * translog_purge() can accept TRANSLOG_ADDRESS, not necessarily a real LSN. storage/maria/ma_loghandler.h: prototype updates storage/maria/ma_open.c: no need to initialize "res" storage/maria/ma_pagecache.c: When taking a checkpoint, we don't need to know the maximum rec_lsn of dirty pages; this LSN was intended to be used in the two-checkpoint rule, but last_checkpoint_lsn is as good. 4 bytes for stored_list_size is enough as PAGECACHE::blocks (number of blocks which the pagecache can contain) is int. storage/maria/ma_pagecache.h: new prototype storage/maria/ma_recovery.c: * added replaying of REDO_RENAME_TABLE * UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END * Recovery from the last checkpoint record now possible * In new_table() we skip the table if the id->name mapping is older than create_rename_lsn (mapping dates from lsn_of_file_id). 
* in get_MARIA_HA_from_REDO_record() we skip the record if the id->name mapping is newer than the record (can happen if processing a record which is before the checkpoint record). * parse_checkpoint_record() has to return a LSN, that's what caller expects storage/maria/ma_rename.c: new function's name; log end zeroes of tables' names (ease recovery) storage/maria/ma_test2.c: * equivalent of ma_test1's --test-undo added (named -u here). * -t=1 now stops right after creating the table, so that we can test undoing of INSERTs with duplicate keys (which tests the CLR_END logged by _ma_write_abort_block_record()). storage/maria/ma_test_recovery.expected: Result of testing undoing of INSERTs with duplicate keys; there are some differences in maria_chk -dvv but they are normal (removing records does not shrink data/index file, does not put back the "analyzed, optimized keys"(etc) index state. storage/maria/ma_test_recovery: Test undoing of INSERTs with duplicate keys, using ma_test2; when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE, CLR_END; we abort after that, and test that CLR_END causes recovery to jump over UNDO_INSERT. storage/maria/ma_write.c: comment storage/maria/maria_chk.c: comment storage/maria/maria_def.h: * a new bit in MARIA_SHARE::in_checkpoint, used to build a list of unique shares during Checkpoint. * MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID for this share; needed to know to which LSN domain the mappings found in the Checkpoint record apply (new mappings should not apply to old REDOs). storage/maria/trnman.c: * small changes to how trnman_collect_transactions() fills its buffer; it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h |
||
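The skipping rules listed for ma_check.c and ma_recovery.c above (REDOs before create_rename_lsn are ignored; a table is skipped if its id->name mapping is older than create_rename_lsn; a record is skipped if the mapping is newer than the record) can be put side by side in a small sketch. Types and function names are hypothetical, LSNs are modelled as plain integers, and the handling of equal LSNs is an assumption, since the commit message only says "older", "newer" and "before".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_lsn;         /* simplified LSN: larger means later */

/* Hypothetical per-table data as Recovery would see it. */
typedef struct {
  demo_lsn create_rename_lsn;      /* REDOs before this are ignored */
  demo_lsn lsn_of_file_id;         /* LSN at which the id->name mapping was logged */
} demo_table;

/* Returns 1 if the REDO record at rec_lsn should be applied, else 0. */
int demo_should_apply_redo(const demo_table *t, demo_lsn rec_lsn)
{
  assert(t != NULL);
  /* Mapping older than create_rename_lsn: the whole table is skipped. */
  if (t->lsn_of_file_id < t->create_rename_lsn)
    return 0;
  /* Mapping newer than the record: it does not describe this record. */
  if (t->lsn_of_file_id > rec_lsn)
    return 0;
  /*
    Record before create_rename_lsn. In this simplified model the check
    is implied by the two above; it is kept to state the rule explicitly.
  */
  if (rec_lsn < t->create_rename_lsn)
    return 0;
  return 1;
}
```

With `create_rename_lsn = 100` and `lsn_of_file_id = 150`, a record at LSN 120 is skipped (the mapping is newer than the record) and a record at LSN 200 is applied.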
unknown
|
0b2ba820c3 |
WL#3072 Maria recovery
* testing of execution of UNDO_ROW_UPDATE * when executing an UNDO_ROW_UPDATE, store "UNDO_ROW_UPDATE" as "type of undone record" into the CLR_END record. storage/maria/ma_blockrec.c: When logging a CLR_END in write_block_record(), it can be for a DELETE or for an UPDATE (now that Monty has coded execution of UNDO_UPDATE) storage/maria/ma_loghandler.c: UNDO_ROW_UPDATE's execution coded, so no crash storage/maria/ma_recovery.c: UNDO_ROW_UPDATE's execution now coded, so no crash storage/maria/ma_test1.c: upper case letter storage/maria/ma_test_recovery.expected: output of testing execution of UNDO_ROW_UPDATE. Table's checksum not recovered (known issue not specific to UPDATE). storage/maria/ma_test_recovery: Test execution of UNDO_ROW_UPDATE: first we stop ma_test1 after deletes and commit, then we stop ma_test1 after updates and abort; we verify that updates are rolled back by comparing tables |
||
unknown
|
6aef814d98 |
Fixed some bugs when using undo of VARCHAR fields
Fixed bug in undo_delete Fixed wrong error output from maria_check include/my_base.h: Added marker if we have null fields in table mysql-test/r/maria.result: checksum in maria now ignores null fields that are null sql/sql_table.cc: Ignore null fields that are null (Before enabling this, we have to change MyISAM to also skip null fields) storage/maria/ma_blockrec.c: More logging After merge fixes Fixed some bugs when using undo of VARCHAR fields Fixed bug in undo_delete (We can't use info->rec_buff here as this is used in write_block_record()) storage/maria/ma_blockrec.h: ma_recordpos_to_dir_entry changed to return uint storage/maria/ma_check.c: Fixed wrong output in case of errors storage/maria/ma_create.c: Set share.base.pack_reclength more correctly for block record Delete support for RAID storage/maria/ma_open.c: Don't calculate checksum for fields with value NULL storage/maria/ma_test1.c: Fixed output from -v for VARCHAR keys storage/maria/ma_test_recovery.expected: Update results after adding new printf New checksums (because we now ignore nulls) Some file lengths are different, but I think they are OK (didn't have time to investigate) storage/myisam/ha_myisam.cc: Fixed comment storage/myisam/mi_test1.c: Fixed bug |
||
unknown
|
2291f932b2 |
- WL#3072 Maria Recovery:
Recovery of state.records (the count of records which is stored into the header of the index file). For that, state.is_of_lsn is introduced; logic is explained in ma_recovery.c (look for "Recovery of the state"). The net gain is that in case of crash, we now recover state.records, and it is idempotent (ma_test_recovery tests it). state.checksum is not recovered yet, mail sent for discussion. - WL#3071 Maria Checkpoint: preparation for it, by protecting all modifications of the state in memory or on disk with intern_lock (with the exception of the really-often-modified state.records, which is now protected with the log's lock, see ma_recovery.c (look for "Recovery of the state"). Also, if maria_close() sees that Checkpoint is looking at this table it will not my_free() the share. - don't compute row's checksum twice in case of UPDATE (correction to a bugfix I made yesterday). storage/maria/ha_maria.cc: protect state write with intern_lock (against Checkpoint) storage/maria/ma_blockrec.c: * don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it should wait until we have corrected the allocation in the bitmap (as the REDO can serve to correct the allocation during Recovery); introducing _ma_finalize_row() for that. * In a changeset yesterday I moved computation of the checksum into write_block_record(), to fix a bug in UPDATE. Now I notice that maria_update() already computes the checksum, it's just that it puts it into info->cur_row while _ma_update_block_record() uses info->new_row; so, removing the checksum computation from write_block_record(), putting it back into allocate_and_write_block_record() (which is called only by INSERT and UNDO_DELETE), and copying cur_row->checksum into new_row->checksum in _ma_update_block_record(). storage/maria/ma_check.c: new prototypes, they will take intern_lock when writing the state; also take intern_lock when changing share->kfile. 
In both cases this is to protect against Checkpoint reading/writing the state or reading kfile at the same time. Not updating create_rename_lsn directly at end of write_log_record_for_repair() as it wouldn't have intern_lock. storage/maria/ma_close.c: Checkpoint builds a list of shares (under THR_LOCK_maria), then it handles each such share (under intern_lock) (doing flushing etc); if maria_close() freed this share between the two, Checkpoint would see a bad pointer. To avoid this, when building the list Checkpoint marks each share, so that maria_close() knows it should not free it and Checkpoint will free it itself. Extending the zone covered by intern_lock to protect against Checkpoint reading kfile, writing state. storage/maria/ma_create.c: When we update create_rename_lsn, we also update is_of_lsn to the same value: it is logical, and allows us to test in maria_open() that the former is not bigger than the latter (the contrary is a sign of index header corruption, or severe logging bug which hinders Recovery, table needs a repair). _ma_update_create_rename_lsn_on_disk() also writes is_of_lsn; it now operates under intern_lock (protect against Checkpoint), a shortcut function is available for cases where acquiring intern_lock is not needed (table's creation or first open). storage/maria/ma_delete.c: if table is transactional, "records" is already decremented when logging UNDO_ROW_DELETE. storage/maria/ma_delete_all.c: comments storage/maria/ma_extra.c: Protect modifications of the state, in memory and/or on disk, with intern_lock, against a concurrent Checkpoint. When state goes to disk, update its is_of_lsn (by calling the new _ma_state_info_write()). In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing a change I made a few days ago) and ASK_MONTY storage/maria/ma_locking.c: no real code change here. 
storage/maria/ma_loghandler.c: Log-write-hooks for updating "state.records" under log's mutex when writing/updating/deleting a row or deleting all rows. storage/maria/ma_loghandler_lsn.h: merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different) storage/maria/ma_open.c: When opening a table verify that is_of_lsn >= create_rename_lsn; if false the header must be corrupted. _ma_state_info_write() is split in two: _ma_state_info_write_sub() which is the old _ma_state_info_write(), and _ma_state_info_write() which additionally takes intern_lock if requested (to protect against Checkpoint) and updates is_of_lsn. _ma_open_keyfile() should change kfile.file under intern_lock to protect Checkpoint from reading a wrong kfile.file. storage/maria/ma_recovery.c: Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT which has a LSN > state.is_of_lsn it increments state.records. Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE. When closing a table during Recovery, we know its state is at least as new as the current log record we are looking at, so increase is_of_lsn to the LSN of the current log record. storage/maria/ma_rename.c: update for new behaviour of _ma_update_create_rename_lsn_on_disk(). storage/maria/ma_test1.c: update to new prototype storage/maria/ma_test2.c: update to new prototype (actually prototype was changed days ago, but compiler does not complain about the extra argument??) storage/maria/ma_test_recovery.expected: new result file of ma_test_recovery. Improvements: record count read from index's header is now always correct. storage/maria/ma_test_recovery: "rm" fails if file does not exist. Redirect stderr of script. storage/maria/ma_write.c: if table is transactional, "records" is already incremented when logging UNDO_ROW_INSERT. Comments. storage/maria/maria_chk.c: update is_of_lsn too storage/maria/maria_def.h: - MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored into the index file's header. 
- Checkpoint can now mark a table as "don't free this", and maria_close() can reply "ok then you will free it". - new functions storage/maria/maria_pack.c: update for new name |
||
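The "Recovery of the state" logic described above for state.records can be sketched as follows, assuming integer LSNs and hypothetical names. The sketch applies an UNDO_ROW_INSERT or UNDO_ROW_DELETE seen in the REDO phase only when its LSN is greater than is_of_lsn, and raises is_of_lsn when the table is closed, which is what makes a second run over the same log a no-op. UNDO_ROW_PURGE is left out, and treating a delete as a decrement is an interpretation of "same for UNDO_ROW_DELETE".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_lsn;         /* simplified LSN: larger means later */

typedef enum { DEMO_UNDO_ROW_INSERT, DEMO_UNDO_ROW_DELETE } demo_rec_type;

/* Hypothetical subset of the state stored in the index file's header. */
typedef struct {
  uint64_t records;                /* count of records */
  demo_lsn is_of_lsn;              /* state reflects the log up to this LSN */
  demo_lsn create_rename_lsn;
} demo_state;

/* Open-time sanity check: is_of_lsn must not be below create_rename_lsn. */
int demo_header_is_sane(const demo_state *s)
{
  assert(s != NULL);
  return s->is_of_lsn >= s->create_rename_lsn;
}

/* REDO phase: account for a row UNDO record unless the state already has it. */
void demo_redo_sees_undo(demo_state *s, demo_rec_type type, demo_lsn rec_lsn)
{
  assert(s != NULL);
  if (rec_lsn <= s->is_of_lsn)
    return;                        /* already reflected in state.records */
  if (type == DEMO_UNDO_ROW_INSERT)
    s->records++;
  else
  {
    assert(s->records > 0);
    s->records--;
  }
}

/* Closing a table during Recovery: its state is as new as the current record. */
void demo_close_during_recovery(demo_state *s, demo_lsn current_rec_lsn)
{
  assert(s != NULL);
  if (current_rec_lsn > s->is_of_lsn)
    s->is_of_lsn= current_rec_lsn;
}
```

Replaying the same records after `demo_close_during_recovery()` changes nothing, because every record now has an LSN that is not greater than `is_of_lsn`.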
unknown
|
d53991853e |
- speed optimization:
minimize writes to transactional Maria tables: don't write data pages, state, and open_count at the end of each statement. Data pages will be written by a background thread periodically. State will be written by Checkpoint periodically. open_count serves to detect when a table is potentially damaged due to an unclean mysqld stop, but thanks to recovery an unclean mysqld stop will be corrected and so open_count becomes useless. As state is written less often, it is often obsolete on disk, we thus should avoid to read it from disk. - by removing the data page writes above, it is necessary to put it back at the start of some statements like check, repair and delete_all. It was already necessary in fact (see ma_delete_all.c). - disabling CACHE INDEX on Maria tables for now (fixes crash of test 'key_cache' when run with --default-storage-engine=maria). - correcting some fishy code in maria_extra.c (we possibly could lose index pages when doing a DROP TABLE under Windows, in theory). storage/maria/ha_maria.cc: disable CACHE INDEX in Maria for now (there is a single cache for now), it crashes and it's not a priority storage/maria/ma_bitmap.c: debug message storage/maria/ma_check.c: The statement before maria_repair() may not flush state, so it needs to be done by maria_repair() (indeed this function uses maria_open(HA_OPEN_COPY) so reads state from disk, so needs to find it up-to-date on disk). For safety (but normally this is not needed) we remove index blocks out of the cache before repairing. _ma_flush_blocks() becomes _ma_flush_table_files_after_repair(): it now additionally flushes the data file and state and syncs files. As a side effect, the assertion "no WRITE_CACHE_USED" from _ma_flush_table_files() fired so we move all end_io_cache() done at the end of repair to before the calls to _ma_flush_table_files_after_repair(). storage/maria/ma_close.c: when closing a transactional table, we fsync it. But we need to do this only after writing its state. 
We need to write the state at close time only for transactional tables (the other tables do that at last unlock). Putting back the O_RDONLY||crashed condition which I had removed earlier. Unmap the file before syncing it (does not matter now as Maria does not use mmap) storage/maria/ma_delete_all.c: need to flush data pages before chsize-ing it. Was needed even when we flushed data pages at the end of each statement, because we didn't anyway do it if under LOCK TABLES: the change here thus fixes this bug: create table t(a int) engine=maria;lock tables t write; insert into t values(1);delete from t;unlock tables;check table t; "Size of datafile is: 16384 Should be: 8192" (an obsolete page went to disk after the chsize(), at unlock time). storage/maria/ma_extra.c: When doing share->last_version=0, we make the MARIA_SHARE-in-memory invisible to future openers, so need to have an up-to-date state on disk for them. The same way, future openers will reopen the data and index file, so they will not find our cached blocks, so we need to flush them to disk. In HA_EXTRA_FORCE_REOPEN, this probably happens naturally as all tables normally get closed, we however add a safety flush. In HA_EXTRA_PREPARE_FOR_RENAME, we need to do the flushing. On Windows we additionally need to close files. In HA_EXTRA_PREPARE_FOR_DROP, we don't need to flush anything but remove dirty cached blocks from memory. On Windows we need to close files. Closing files forces us to sync them before (requirement for transactional tables). For mutex reasons (don't lock intern_lock twice), we move maria_lock_database() and _ma_decrement_open_count() first in the list of operations. Flush also data file in HA_EXTRA_FLUSH. storage/maria/ma_locking.c: For transactional tables: - don't write data pages / state at unlock time; as a consequence, "share->changed=0" cannot be done. 
- don't write state in _ma_writeinfo() - don't maintain open_count on disk (Recovery corrects the table in case of crash anyway, and we gain speed by not writing open_count to disk). For non-transactional tables, flush the state at unlock only if the table was changed (optimization). Code which read the state from disk is relevant only with external locking, we disable it (if we want to re-enable it, it shouldn't be for transactional tables, as state on disk may be obsolete: such tables do not flush state at unlock anymore). The comment "We have to flush the write cache" is now wrong because maria_lock_database(F_UNLCK) now happens before thr_unlock(), and we are not using external locking. storage/maria/ma_open.c: _ma_state_info_read() is only used in ma_open.c, making it static storage/maria/ma_recovery.c: set MARIA_SHARE::changed to TRUE when we are going to apply a REDO/UNDO, so that the state gets flushed at close. storage/maria/ma_test_recovery.expected: Changes introduced by this patch: - good: the "open" (table open, not properly closed) is gone, it was pointless for a recovered table - bad: stemming from different moments of writing the index's state probably (_ma_writeinfo() used to write the state after every row write in ma_test* programs, doesn't anymore as the table is transactional): some differences in indexes (not relevant as we don't yet have recovery for them); some differences in count of records (changed from a wrong value to another wrong value) (not relevant as we don't recover this count correctly yet anyway, though a patch will be pushed soon). storage/maria/ma_test_recovery: for repeatable output, no names of varying directories. storage/maria/maria_chk.c: function renamed storage/maria/maria_def.h: Function became local to ma_open.c. Function renamed. |
||
unknown
|
ac4ad9bdba |
WL#3072 Maria Recovery
misc fixes of execution of UNDOs in the UNDO phase: - into the CLR_END, store the LSN of the _previous_ UNDO (we debated what was best, so far we're going with "previous"; later we can change to "current" if needed), and store the type of record which is being undone (needed to know how to update state.records when we see the CLR_END during the REDO phase). - declaring all UNDOs and CLR_END as "compressed" - when executing an UNDO in the UNDO phase, state.records is updated as a hook when writing CLR_END (needed for "recovery of the state"), and so is trn->undo_lsn (needed for when we have checkpoints). - bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum into the re-inserted row, maria_chk -r thus threw the row away). - modifications of ma_test1: where to stop is now driven by --testflag; --test-undo just tells how to stop (flush data, flush log, nothing). - ma_test_recovery: testing of the UNDO phase, more testing of the REDO phase, identification of a bug. storage/maria/ma_blockrec.c: - bugfix: execution of UNDO_ROW_DELETE didn't store the correct checksum into the row (leading to "maria_chk -r" eliminating the re-inserted row, net effect was that rollback appeared to have rolled back no deletion). Reason was that write_block_record() used info->cur_row.checksum, while "row" can be != &info->cur_row (case of UNDO_ROW_DELETE). After fixing this, problems with _ma_update_block_record() appeared; indeed checksum was computed by allocate_and_write_block_record() while _ma_update_block_record() directly calls write_block_record(). Solution is to compute checksum in write_block_record() instead. - when executing an UNDO, we now pass the LSN of the _previous_ UNDO to block_format functions. This LSN can be 0 (if the being-executed UNDO was the transaction's first UNDO), so "undo_lsn==0" cannot work anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR instead (this is an impossible LSN). 
- store into CLR_END the type of log record which was undone (INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has to update state.records if it sees this CLR_END in the REDO phase. - when writing the CLR_END in _ma_apply_undo_row_insert(), the place to store file's id is log_data+LSN_STORE_SIZE. - in _ma_apply_undo_row_insert(), the records-- is moved to a hook when writing the CLR_END (this way it is under log's mutex which is needed for "recovery of the state") storage/maria/ma_loghandler.c: - all UNDOs, and CLR_END, start with the LSN of another UNDO; so we can declare them "compressed". - write_hook_for_clr_end() to set trn->undo_lsn (to the previous UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's lock), and also update, if appropriate, state.records. - reset share->id to 0 when deassigning; not useful for now but sounds logical. storage/maria/ma_recovery.c: - if no table is found for a REDO, it's not an error; for an UNDO, it is - in the REDO phase, when we see a CLR_END we must update trn->undo_lsn and sometimes state.records. - in the UNDO phase, when we execute an UNDO_ROW_INSERT: * update trn->undo_lsn only after executing the record * store the _previous_ undo_lsn into the CLR_END - at the end of the REDO phase, when we recreate TRN objects, they have already their long id in the log (either via a LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write a new, useless LOGREC_LONG_TRANSACTION_ID for them. storage/maria/ma_test1.c: * where to stop execution is now driven by --testflag and not --test-undo (ma_test2 already has --testflag for the same purpose). This allows us to do a clean stop (with commit) at any point. * --test-undo=# tells how to abort (flush all pages (which implies flushing log) or only log or nothing); all such "ways of crashing" are tested in ma_test_recovery storage/maria/ma_test_recovery: * Testing execution of UNDOs, with and without BLOBs. * Testing idempotency of REDOs. 
* See @todo for a probable bug with BLOBs. * maria_chk -rq instead of -r, as with -q it nicely stops on any problem in the data file (like the checksum bug see comment of ma_blockrec.c). * Testing if log was written by UNDO phase (often expected), not written by REDO phase (always expected). * Less output on the screen, compares with expected output in the end. * some shell thingies like "set --" and $# are courtesy of Danny and Pekka. storage/maria/maria_read_log.c: when only displaying the records, don't do an UNDO phase storage/maria/ma_test_recovery.expected: This is the expected output of a great part of ma_test_recovery. ma_test_recovery compares its output to the expected output and tells if different. If we look at this file it mentions differences in checksum (normal, it's not recovered yet) and in records count (getting a correct records' count when recovery starts on an already existing table, like when testing rollback, is coded but not yet pushed). |
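The role of the "previous UNDO" LSN stored in each CLR_END can be shown with a toy log. This is a sketch under simplifying assumptions, not the Maria implementation: the log is an array, the LSN of `log[i]` is `i + 1`, 0 means "no earlier UNDO", and all names are hypothetical. The REDO-phase scan tracks the transaction's undo_lsn, and the UNDO phase then follows the chain, so UNDOs already compensated by a CLR_END are not executed again.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_UNDO, DEMO_CLR_END } demo_type;

/* Both record kinds start with the LSN of another (earlier) UNDO, or 0. */
typedef struct {
  demo_type type;
  size_t prev_undo_lsn;
} demo_rec;

/*
  REDO phase over log[0..n-1], where log[i] has LSN i+1.
  An UNDO becomes the newest record to undo; a CLR_END moves undo_lsn
  back to the UNDO preceding the one it compensated.
*/
size_t demo_redo_scan(const demo_rec *log, size_t n)
{
  size_t undo_lsn= 0;
  size_t i;
  assert(log != NULL);
  for (i= 0; i < n; i++)
    undo_lsn= (log[i].type == DEMO_UNDO) ? i + 1 : log[i].prev_undo_lsn;
  return undo_lsn;
}

/* UNDO phase: count the UNDOs still to execute, following the chain. */
size_t demo_count_undos_to_execute(const demo_rec *log, size_t n,
                                   size_t undo_lsn)
{
  size_t count= 0;
  assert(log != NULL);
  while (undo_lsn != 0)
  {
    assert(undo_lsn <= n);
    count++;
    undo_lsn= log[undo_lsn - 1].prev_undo_lsn;
  }
  return count;
}
```

For a log holding three chained UNDOs followed by one CLR_END that compensates the third, the scan returns 2 and two UNDOs remain to be executed.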