Fixed bugs in undo logging
Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read)
Reserved place for reference-transid on key pages (for packing of transids)
ALTER TABLE and INSERT ... SELECT now uses fast creation of index
Known bugs:
ma_test_recovery fails because of a bug in redo handling when log is cut directly after a redo (Guilhem knows how to fix)
ma_test_recovery.excepted is not totally correct, because of the above bug
mysqld sometimes fails to restart; Fails with error "end_of_redo_phase: Assertion `long_trid != 0' failed"; Guilhem to investigate
include/maria.h:
Prototype changes
Added current_filepos to st_maria_sort_info
mysql-test/r/maria.result:
Updated results that changes as alter table and insert ... select now uses fast creation of index
mysys/mf_iocache.c:
Reset variable to gurard against double invocation
storage/maria/ma_bitmap.c:
Added _ma_bitmap_reset_cache() (needed for repair)
storage/maria/ma_blockrec.c:
Simplify code
More initial allocations
Fixed bug where head block was split before min_row_length (caused Maria to believe row was crashed on read)
storage/maria/ma_blockrec.h:
Moved TRANSID_SIZE to maria_def.h
Added prototype for new functions
storage/maria/ma_check.c:
Simplicy code
Fixed repair_by_sort to work with BLOCK_RECORD
- When using BLOCK_RECORD or UNPACK create new Maria handle
- Use common initializer function
- Align code with maria_repair()
Made some changes to maria_repair_parallel() to use common initializer function
Removed ASK_MONTY section by fixing noted problem
storage/maria/ma_close.c:
Moved check for readonly to _ma_state_info_write()
storage/maria/ma_key_recover.c:
Use different log entries if key root changes or not.
This fixed some bugs when tree grows
storage/maria/ma_key_recover.h:
Added keynr to st_msg_to_write_hook_for_undo_key
storage/maria/ma_loghandler.c:
Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT
storage/maria/ma_loghandler.h:
Added INIT_LOGREC_UNDO_KEY_INSERT_WITH_ROOT
storage/maria/ma_open.c:
Added TRANSID to all key pages (for future compressing of trans id's)
For compressed records, alloc a bit bigger buffer to avoid valgrind warnings
If table is opened readonly, don't update state
storage/maria/ma_packrec.c:
Allocate bigger array for bit unpacking to avoid valgrind errors
storage/maria/ma_recovery.c:
Added UNDO_KEY_INSERT_WITH_ROOT & UNDO_KEY_DELETE_WITH_ROOT
storage/maria/ma_sort.c:
More logging
storage/maria/ma_test_all.sh:
More tests
storage/maria/ma_test_recovery.expected:
Update results
Note that this is not complete becasue of a bug in recovery
storage/maria/ma_test_recovery:
Removed recreation of index (not needed when we have redo for index pages)
storage/maria/maria_chk.c:
When using flag --read-only, don't update status for files
When using --unpack, don't use REPAIR_BY_SORT if other repair option is given
Enable repair_by_sort for BLOCK records
Removed not needed newline at start of --describe
storage/maria/maria_def.h:
Support for TRANSID_SIZE to key pages
storage/maria/maria_read_log.c:
renamed --only-display to --display-only
Changed bitmaps to be written before unpinning of pages in write_block_record()
Log handler now assumes we never call it for not transactional tables
Fixed bug in ma_test_all that caused it to fail
sql/unireg.cc:
Removed 'at', as this makes it hard to find valgrind errors from the debug log
storage/maria/ma_blockrec.c:
Changed bzero() of blocks to get rid of a (non critical) valgrind error
Changed bitmaps to be written before unpinning of pages in write_block_record()
fixed that we don't log tails if table isn't transactional
storage/maria/ma_key_recover.c:
Fixed wrong log_data[] that caused us to log uninitialized data
storage/maria/ma_loghandler.c:
Replaced not needed test with DBUG_ASSERT()
storage/maria/ma_test_all.sh:
Remove control file if block size changes
New extendable format for maria_log_control file
Fixed some compiler warnings
include/maria.h:
Added maria_disable_logging() and maria_enable_logging()
mysql-test/include/maria_verify_recovery.inc:
Updated tests now when key redo/undo works
mysql-test/r/maria-recovery.result:
Updated tests now when key redo/undo works
storage/maria/ma_blockrec.c:
Use unified CLR code
Added rec_lsn for full pages
Moved clr write hook to ma_key_recover.c
Changed REDO code to keep pages pinned until undo
Mark page_link's as changed
storage/maria/ma_blockrec.h:
Moved write_hook_for_clr_end() to ma_key_recover.c
storage/maria/ma_check.c:
Changed key check code to use PAGECACHE_READ_UNKNOWN_PAGE
Fixed wrong warning when checking files after maria_pack
When unpacking files, we have to use new keypos_to_recpos method
When doing repair, we can disregard index key file pages in page cache
storage/maria/ma_commit.c:
Added simple enable/disable logging functions
(Needed for recovery)
storage/maria/ma_control_file.c:
Make maria control file extendable without having to make it incompatible for older versions
storage/maria/ma_control_file.h:
New error messages
Added CONTROL_FILE_VERSION
storage/maria/ma_delete.c:
Added redo/undo for key pages
change_length -> changed_length to make things similar
More comments & more DBUG
storage/maria/ma_key_recover.c:
Unified CLR method
Moved here write_hook_for_clr_end() and common keypage log functions
Changed REDO to keep pages pinned until undo
Changed UNDO code to change key_root under log mutex
storage/maria/ma_key_recover.h:
New structures and functions
storage/maria/ma_loghandler.c:
Include needed files
storage/maria/ma_open.c:
Change maria_open() to use pread() instead of read()
storage/maria/ma_page.c:
Fixed bug in key_del handling
Clear pages if IDENTICAL_PAGES_AFTER_RECOVERY is defined
storage/maria/ma_pagecache.c:
Indentation and spelling fixes
More DBUG
Added helper function: pagecache_block_link_to_buffer()
storage/maria/ma_pagecache.h:
Added pagecache_block_link_to_buffer()
storage/maria/ma_recovery.c:
Fixed state.changed
Fixed that REDO keeps pages pinned until UNDO
Some bug fixes from previous commit
Fixes for UNDO/REDO of key pages
storage/maria/ma_search.c:
Fixed packing and storing of keys to provide more information to caller so
that we can do efficent REDO logging of the changes.
storage/maria/ma_test1.c:
Fixed bug with not initialized variable
storage/maria/ma_test2.c:
Removed not used code
storage/maria/ma_test_all.res:
Updated results
storage/maria/ma_test_all.sh:
Changed one test to test more
Removed timing tests as not relevant here
storage/maria/ma_test_recovery.expected:
Updated test result after redo/undo if key pages works
storage/maria/ma_test_recovery:
Updated test after redo/undo if key pages works
storage/maria/ma_write.c:
Moved some general log functions to ma_key_recover.c
Fixed some bugs in undo
Moved ma_log_split() to _ma_split_page()
Small changes in some function arguments to be able to do redo logging
storage/maria/maria_chk.c:
disable logging while doing repair table
storage/maria/maria_def.h:
New function prototypes
Move some structs and functions to ma_key_recover.c
storage/maria/unittest/ma_control_file-t.c:
Updated with patch from Sanja
NOTE: This is not complete and need to be updated to new control file format
storage/maria/unittest/ma_test_loghandler-t.c:
Fixed compiler warning
into mysql.com:/home/my/mysql-maria
storage/maria/ha_maria.cc:
Auto merged
storage/maria/ma_bitmap.c:
Auto merged
storage/maria/ma_checkpoint.c:
Auto merged
storage/maria/ma_close.c:
Auto merged
storage/maria/ma_loghandler.c:
Auto merged
storage/maria/ma_loghandler.h:
Auto merged
storage/maria/ma_open.c:
Auto merged
storage/maria/ma_pagecache.h:
Auto merged
storage/maria/ma_write.c:
Auto merged
storage/maria/maria_def.h:
Auto merged
storage/maria/unittest/ma_pagecache_single.c:
Auto merged
storage/maria/ma_blockrec.c:
Manual merge
storage/maria/ma_page.c:
Manual merge
storage/maria/ma_pagecache.c:
Manual merge
storage/maria/ma_preload.c:
Manual merge
storage/maria/ma_recovery.c:
Manual merge
Add _ma_unpin_all_pages() to all new UNDO redo_exec_hook's
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
For transactional tables, shift record number in keys up with 1 bit to have place to indicate if transid follows
Checksum for MyISAM now ignores NULL and not used part of VARCHAR
Renamed some variables that caused shadow compiler warnings
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
Fixed crashing bugs when using Maria TEMPORARY tables with TRUNCATE. Removed 'hack' code in sql directory to go around this bug.
pagecache_unlock_by_ulink() now has extra argument to say if page was changed.
Give error message if we fail to open control file
Mark page cache variables as not flushable
include/maria.h:
Made min page cache larger (needed for pinning key page)
Added key_nr to st_maria_keydef for faster keyinfo->keynr conversion
Added write_comp_flag to move some runtime code to maria_open()
include/my_base.h:
Added new error message to be used when handler initialization failed
include/my_global.h:
Renamed dummy to swap_dummy to avoid conflicts with local 'dummy' variables
include/my_handler.h:
Added const to some parameters
mysys/array.c:
More DBUG
mysys/my_error.c:
Fixed indentation
mysys/my_handler.c:
Added const to some parameters
Added missing error messages
sql/field.h:
Renamed variables to avoid variable shadowing
sql/handler.h:
Renamed parameter to avoid variable name conflict
sql/item.h:
Renamed variables to avoid variable shadowing
sql/log_event_old.h:
Renamed variables to avoid variable shadowing
sql/set_var.h:
Renamed variables to avoid variable shadowing
sql/sql_delete.cc:
Removed maria hack for temporary tables
Fixed indentation
sql/sql_table.cc:
Moved extra() call when waiting for tables to not be used to after tables are removed from cache.
This was needed to ensure we don't do a PREPARE_FOR_DROP or similar call while the table is still in use.
sql/table.cc:
Copy page_checksum from share
Removed Maria hack
storage/maria/Makefile.am:
Added new files
storage/maria/ha_maria.cc:
Renamed records -> record_count and info -> create_info to avoid variable name conflicts
Mark page cache variables as not flushable
storage/maria/ma_blockrec.c:
Moved _ma_unpin_all_pages() to ma_key_recover.c
Moved init of info->pinned_pages to ma_open.c
Moved _ma_finalize_row() to maria_key_recover.h
Renamed some variables to avoid variable name conflicts
Mark page_link.changed for blocks we change directly
Simplify handling of undo link when writing LOGREC_UNDO_ROW_INSERT (old code crashed when having redo for index)
storage/maria/ma_blockrec.h:
Removed extra empty line
storage/maria/ma_checkpoint.c:
Remove not needed trnman.h
storage/maria/ma_close.c:
Free pinned pages (which are now always allocated)
storage/maria/ma_control_file.c:
Give error message if we fail to open control file
storage/maria/ma_delete.c:
Changes for redo logging (first part, logging of underflow not yet done)
- Log undo-key-delete
- Log delete of key
- Updated arguments to _ma_fetch_keypage(), _ma_dispose(), _ma_write_keypage(), _ma_insert()
- Added new arguments to some functions to be able to write redo information
- Mark key pages as changed when we write with PAGECACHE_LOCK_LEFT_WRITELOCKED
Remove one not needed _ma_write_keypage() in d_search() when upper level will do the write anyway
Changed 2 bmove_upp() to bmove() as this made code easer to understand
More function comments
Indentation fixes
storage/maria/ma_ft_update.c:
New arguments to _ma_write_keypage()
storage/maria/ma_loghandler.c:
Fixed some DBUG_PRINT messages
Simplify code
Added new log entrys for key page redo
Renamed some variables to avoid variable name shadowing
storage/maria/ma_loghandler.h:
Moved some defines here
Added define for storing key number on key pages
Added new translog record types
Added enum for type of operations in LOGREC_REDO_INDEX
storage/maria/ma_open.c:
Always allocate info.pinned_pages (we need now also for normal key page usage)
Update keyinfo->key_nr
Added virtual functions to convert record position o number to be stored on key pages
Update keyinfo->write_comp_flag to value of search flag to be used when writing key
storage/maria/ma_page.c:
Added redo for key pages
- Extended _ma_fetch_keypage() with type of lock to put on page and address to used MARIA_PINNED_PAGE
- _ma_fetch_keypage() now pin's pages if needed
- Extended _ma_write_keypage() with type of locks to be used
- ma_dispose() now locks info->s->state.key_del from other threads
- ma_dispose() writes redo log record
- ma_new() locks info->s->state.key_del from other threads if it was used
- ma_new() now pins read page
Other things:
- Removed some not needed arguments from _ma_new() and _ma_dispose)
- Added some new variables to simplify code
- If EXTRA_DEBUG is used, do crc on full page to catch not unitialized bytes
storage/maria/ma_pagecache.h:
Applied patch from Sanja to add extra argument to pagecache_unlock_by_ulink() to mark if page was changed
Added some defines for pagecache priority levels that one can use
storage/maria/ma_range.c:
Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_recovery.c:
- Added hooks for new translog types:
REDO_INDEX, REDO_INDEX_NEW_PAGE, REDO_INDEX_FREE_PAGE, UNDO_KEY_INSERT, UNDO_KEY_DELETE and
UNDO_KEY_DELETE_WITH_ROOT.
- Moved variable declarations to start of function (portability fixes)
- Removed some not needed initializations
- Set only relevant state changes for each redo/undo entry
storage/maria/lockman.c:
Removed end space
storage/maria/ma_check.c:
Removed end space
storage/maria/ma_create.c:
Removed end space
storage/maria/ma_locking.c:
Removed end space
storage/maria/ma_packrec.c:
Removed end space
storage/maria/ma_pagecache.c:
Removed end space
storage/maria/ma_panic.c:
Removed end space
storage/maria/ma_rt_index.c:
Added new arguments for call to _ma_fetch_keypage(), _ma_write_keypage(), _ma_dispose() and _ma_new()
Fixed indentation
storage/maria/ma_rt_key.c:
Added new arguments for call to _ma_fetch_keypage()
storage/maria/ma_rt_split.c:
Added new arguments for call to _ma_new()
Use new keypage header
Added new arguments for call to _ma_write_keypage()
storage/maria/ma_search.c:
Updated comments & indentation
Added new arguments for call to _ma_fetch_keypage()
Made some variables and arguments const
Added virtual functions for converting row position to number to be stored in key
use MARIA_RECORD_POS of record position instead of my_off_t
Record in MARIA_KEY_PARAM how page was changed one key insert (needed for REDO)
storage/maria/ma_sort.c:
Removed end space
storage/maria/ma_statrec.c:
Updated arguments for call to _ma_rec_pos()
storage/maria/ma_test1.c:
Fixed too small buffer to init_pagecache()
Fixed bug when using insert_count and test_flag
storage/maria/ma_test2.c:
Use more resonable pagecache size
Remove not used code
Reset blob_length to fix wrong output message
storage/maria/ma_test_all.sh:
Fixed wrong test
storage/maria/ma_write.c:
Lots of new code to handle REDO of key pages
No logic changes because of REDO code, mostly adding new arguments and adding new code for logging
Added new arguments for calls to _ma_fetch_keypage(), _ma_write_keypage() and similar functions
Move setting of comp_flag in ma_ck_wrte_btree() from runtime to maria_open()
Zerofill new used pages for:
- To remove possible sensitive data left in buffer
- To get idenitical data on pages after running redo
- Better compression of pages if archived
storage/maria/maria_chk.c:
Added information if table is crash safe
storage/maria/maria_def.h:
New virtual function to convert between record position on key and normal record position
Aded mutex and extra variables to handle locking of share->state.key_del
Moved some structure variables to get things more aligned
Added extra arguments to MARIA_KEY_PARAM to be able to remember what was changed on key page on key insert
Added argument to MARIA_PINNED_PAGE to indicate if page was changed
Updated prototypes for functions
Added some structures for signaling changes in REDO handling
storage/maria/unittest/ma_pagecache_single.c:
Updated arguments for changed function calls
storage/myisam/mi_check.c:
Made calc_check_checksum virtual
storage/myisam/mi_checksum.c:
Update checksums to ignore null columns
storage/myisam/mi_create.c:
Mark if table has null column (to know when we have to use mi_checksum())
storage/myisam/mi_open.c:
Added virtual function for calculating checksum to be able to easily ignore NULL fields
storage/myisam/mi_test2.c:
Fixed bug
storage/myisam/myisamdef.h:
Added virtual function for calculating checksum during check table
Removed ha_key_cmp() as this is in handler.h
storage/maria/ma_key_recover.c:
New BitKeeper file ``storage/maria/ma_key_recover.c''
storage/maria/ma_key_recover.h:
New BitKeeper file ``storage/maria/ma_key_recover.h''
storage/maria/ma_key_redo.c:
New BitKeeper file ``storage/maria/ma_key_redo.c''
maria_read_log used to always print a warning message at startup
to say it is unsafe if ALTER TABLE was used. Now it prints it only
if the log does show the problem (=ALTER TABLE or CREATE SELECT, which
both disable logging of REDO_INSERT*).
For that, when ha_maria::external_lock() disables transactionality
it writes a LOGREC_INCOMPLETE_LOG to the log, which "maria_read_log -a"
picks up to write a warning.
REPAIR TABLE also disables those REDO_INSERT* but as maria_read_log
executes LOGREC_REDO_REPAIR no warning is needed.
storage/maria/ha_maria.cc:
as we now log a record when disabling transactionility, we need the
TRN to be set up first
storage/maria/ma_blockrec.c:
comment
storage/maria/ma_loghandler.c:
new type of log record
storage/maria/ma_loghandler.h:
new type of log record
storage/maria/ma_recovery.c:
* maria_apply_log() now returns a count of warnings. What currently
produces warnings is:
- skipping applying UNDOs though there are some (=> inconsistent table)
- replaying log (in maria_read_log) though the log contains some
ALTER TABLE or CREATE SELECT (log misses REDO_INSERT* for those
and is so incomplete).
Count of warnings affects the final message of maria_read_log and
recovery (though in recovery none of the two conditions above should
happen).
* maria_read_log used to always print a warning message at startup
to say it is unsafe if ALTER TABLE was used. Now it prints it only
if the log does show the problem, i.e. ALTER TABLE or CREATE SELECT
was used (both disable logging of REDO_INSERT* as those records are
not needed for recovery; those missing records in turn make
recreation-from-scratch, via maria_read_log, impossible). For that,
when ha_maria::external_lock() disables transactionality,
_ma_tmp_disable_logging_for_table() writes a LOGREC_INCOMPLETE_LOG to
the log, which maria_apply_log() picks up to write a warning.
storage/maria/ma_recovery.h:
maria_apply_log() returns a count of warnings
storage/maria/maria_def.h:
_ma_tmp_disable_logging_for_table() grows so becomes a function
storage/maria/maria_read_log.c:
maria_apply_log can now return a count of warnings, to temper the
"SUCCESS" message printed in the end by maria_read_log.
Advise users to make a backup first.
storage/maria/ma_loghandler.c:
Infinite loop in case of given address more then last LSN fixed
(now it means just flush all log).
Removed unneeded ASSERT.
storage/maria/unittest/ma_test_loghandler-t.c:
The test case of flushing all log added.
into mysql.com:/home/my/mysql-maria
include/my_sys.h:
Auto merged
mysql-test/r/maria.result:
Auto merged
mysql-test/t/maria.test:
Auto merged
sql/handler.h:
Auto merged
sql/mysqld.cc:
Auto merged
storage/maria/ha_maria.cc:
Auto merged
storage/maria/ma_bitmap.c:
Auto merged
storage/maria/ma_blockrec.c:
Auto merged
storage/maria/ma_loghandler.c:
Auto merged
storage/maria/ma_pagecache.c:
Auto merged
storage/maria/ma_test1.c:
Auto merged
storage/maria/ma_test_recovery.expected:
Auto merged
storage/maria/ma_test_recovery:
Auto merged
sql/mysql_priv.h:
manual merge
storage/maria/ma_recovery.c:
manual merge
storage/maria/ma_test2.c:
manual merge
Changed format for REDO_INSERT_ROWS_BLOBS
Fixed several bugs in handling of big blobs
Added redo_free_head_or_tail() & redo_insert_row_blobs()
Added uuid to control file
maria_checks now verifies that not used part of bitmap is 0
REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS
Added REDO_FREE_HEAD_OR_TAIL
Fixes problem when trying to read block outside of file during REDO
include/my_global.h:
STACK_DIRECTION is already set by configure
mysql-test/r/maria.result:
Updated results
mysql-test/t/maria.test:
Test shrinking of VARCHAR
mysys/my_realloc.c:
Fixed indentation
mysys/safemalloc.c:
Fixed indentation
sql/filesort.cc:
Removed some casts
sql/mysqld.cc:
Added missing setting of myisam_stats_method_str
sql/uniques.cc:
Removed some casts
storage/maria/ma_bitmap.c:
Added printing of bitmap (for debugging)
Renamed _ma_print_bitmap() -> _ma_print_bitmap_changes()
Added _ma_set_full_page_bits()
Fixed bug in ma_bitmap_find_new_place() (affecting updates) when using big files
storage/maria/ma_blockrec.c:
Changed format for REDO_INSERT_ROWS_BLOBS
Fixed several bugs in handling of big blobs
Added code to fix some cases where redo when using blobs didn't produce idenital .MAD files as normal usage
REDO_FREE_ROW_BLOCKS doesn't anymore change pages; We only mark things free in bitmap
Remove TAIL and filler extents from REDO_FREE_BLOCKS log entry. (Fixed some asserts)
REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS
Delete tails in update. (Fixed bug when doing update that shrinks blob/varchar length)
Fixed bug when doing insert in block outside of file size.
Added redo_free_head_or_tail() & redo_insert_row_blobs()
Added pagecache_unlock_by_link() when read fails.
Much more comments, DBUG and ASSERT entries
storage/maria/ma_blockrec.h:
Prototypes of new functions
Define of SUB_RANGE_SIZE & BLOCK_FILLER_SIZE
storage/maria/ma_check.c:
Verify that not used part of bitmap is 0
storage/maria/ma_control_file.c:
Added uuid to control file
storage/maria/ma_loghandler.c:
REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS
Added REDO_FREE_HEAD_OR_TAIL
storage/maria/ma_loghandler.h:
REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS
Added REDO_FREE_HEAD_OR_TAIL
storage/maria/ma_pagecache.c:
If we write full block, remove error flag for block.
(Fixes problem when trying to read block outside of file)
storage/maria/ma_recovery.c:
REDO_PURGE_BLOCKS -> REDO_FREE_BLOCKS
Added REDO_FREE_HEAD_OR_TAIL
storage/maria/ma_test1.c:
Allow option after 'b' to be compatible with ma_test2
(This is just to simplify test scripts like ma_test_recovery)
storage/maria/ma_test2.c:
Default size of blob is now 1000 instead of 1
storage/maria/ma_test_all.sh:
Added test for bigger blobs
storage/maria/ma_test_recovery.expected:
Updated results
storage/maria/ma_test_recovery:
Added test for bigger blobs
on the loghandler start.
Variable definition moved because it is C programm.
storage/maria/ma_loghandler.c:
Checking that the very last record is fully written
on the loghandler start.
storage/maria/ma_recovery.c:
Variable definition moved because it is C programm.
- fixes (in recovery, checkpoint, log handler) of bugs found
during testing.
- new option --check for maria_read_log: with --only-display (which only
reads the header) it reads the full record, for debugging.
storage/maria/ma_loghandler.c:
importing patch from Sanja for bug of translog_next_LSN() found
during recovery
storage/maria/ma_loghandler_lsn.h:
better types (0L is 4 bytes on some platforms, it causes problems
when used into lsn_store(): right shift >= width of type.
storage/maria/ma_pagecache.c:
work around infamous "PAGECACHE_PLAIN_PAGE used for transactional
tables in specialm case"; REDO phase disables logging and this causes
pages to be PAGECACHE_PLAIN_PAGE, thus ignored wrongly by the
checkpoint taken at the end of the REDO phase.
storage/maria/ma_recovery.c:
- a #ifdef which broke maria_read_log in non-debug builds (no output!)
- support for maria_read_log --check
- detect record corruption before opening the table
- updating is_of_horizon requires writing the state
- fix for wrong parsing of checkpoint record by recovery
storage/maria/ma_recovery.h:
support for maria_read_log --check
storage/maria/maria_read_log.c:
Option --check: --only-display only looks at the header;
adding --check tries a translog_read_record() to see if record can
be fully read (this is to find bugs).
Added ability top change empty space filler of the loghandler.
Fixed end of log reaction.
Fixed memory corruprion bug caused by reading non-filled hage header.
Added debug output.
storage/maria/ma_bitmap.c:
Compiler warnings removed.
storage/maria/ma_blockrec.c:
Compiler warnings removed.
storage/maria/ma_loghandler.c:
Added ability top change empty space filler of the loghandler.
Fixed end of log reaction.
Fixed memory corruprion bug caused by reading non-filled hage header.
Added debug output.
storage/maria/ma_loghandler_lsn.h:
Compiler warnings removed.
moved debugging defines.
Fixed buffer flushing (we copied last page
content before it was actually written, now
we abvance pointer in new buffer and unlock
it while waiting for filling old buffer)
Misc changes:
- fix for benign Valgrind error, compiler warnings
- fix for a segfault in execution of maria_delete_all_rows() and one
when taking multiple checkpoints
- fix for too paranoid assertion
- adding ability to take checkpoints at the end of the REDO phase
and at the end of recovery.
- other minor changes
storage/maria/ha_maria.cc:
The checkpoint done after Recovery is finished, is moved to
maria_recover().
storage/maria/ma_bitmap.c:
fix for Valgrind error: the "shadow debug copy" of the bitmap page
started unitialized and so ma_print_bitmap() would use it uninitialized
storage/maria/ma_checkpoint.c:
* reset pointers to NULL after freeing them, or we segfault at
next checkpoint in my_realloc().
* fix for compiler warnings.
storage/maria/ma_delete_all.c:
info->trn is NULL for non-transactional tables
storage/maria/ma_locking.c:
correct assertion (it fired wrongly in execution of REDO_DROP_TABLE
due to the maria_extra(HA_PREPARE_FOR_DROP)->_ma_decrement_open_count()
->maria_lock_database(F_UNLCK); another solution would have been to
not call _ma_decrement_open_count() (it's ok to have a wrong open
count in a table which we are dropping), but the same problem
would still exist for REDO_RENAME_TABLE.
storage/maria/ma_loghandler.c:
fail early if UNRECOVERABLE_ERROR
storage/maria/ma_recovery.c:
* new argument to maria_apply_log(): should it take checkpoints
(at end of REDO phase and at the very end) or no.
* moving the call to translog_next_LSN() into
parse_checkpoint_record() ("hide the details").
* Refining an error detection for something which could happen
if there is a checkpoint record in the log.
* Using close_one_table() instead of maria_extra(HA_EXTRA_PREPARE_FOR_DROP|RENAME),
as it looks safer, and also changing how close_one_table() works:
it now limits itself to scanning all_tables[], thus having one loopp
instead of two, which should be faster (as a result, it does not
close tables not registered in this array, which is ok as there
should not be any).
storage/maria/ma_recovery.h:
new parameter
storage/maria/maria_read_log.c:
update to new prototype
Debug output information fixed.
Fixed direct page referencing in write mode.
storage/maria/ma_loghandler.c:
TODO added.
storage/maria/ma_pagecache.c:
Debug output information fixed.
Fixed direct page referencing in write mode.
* Recovery of the table's live checksum (CREATE TABLE ... CHECKSUM=1)
is achieved in this patch. The table's live checksum
(info->s->state.state.checksum) is updated in inwrite_rec_hook's
under the log mutex when writing UNDO_ROW_INSERT|UPDATE|DELETE
and REDO_DELETE_ALL. The checksum variation caused by the operation
is stored in these UNDOs, so that the REDO phase, when it sees such
UNDOs, can update the live checksum if it is older (state.is_of_lsn is
lower) than the record. It is also used, as a nice add-on with no
cost, to do less row checksum computation during the UNDO phase
(as we have it in the record already).
Doing this work, it became pressing to move in-write hooks
(write_hook_for_redo() et al) to ma_blockrec.c.
The 'parts' argument of inwrite_rec_hook is unpredictable (it comes
mangled at this stage, for example by LSN compression) so it is
replaced by a 'void* hook_arg', which is used to pass down information,
currently only to write_hook_for_clr_end() (previous undo_lsn and
type of undone record).
* If from ha_maria, we print to stderr how many seconds (with one
fractional digit) the REDO phase took, same for UNDO phase and for
final table close. Just to give an indication for debugging and maybe
also for Support.
storage/maria/ha_maria.cc:
question for Monty
storage/maria/ma_blockrec.c:
* log in-write hooks (write_hook_for_redo() etc) move from
ma_loghandler.c to here; this is natural: the hooks are coupled
to their callers (functions in ma_blockrec.c).
* translog_write_record() now has a new argument "hook_arg";
using it to pass down to write_hook_for_clr_end() the transaction's
previous_undo_lsn and the type of the being undone record, and also
to pass down to all UNDOs the live checksum variation caused by the
operation.
* If table has live checksum, store in UNDO_ROW_INSERT|UPDATE|DELETE
and in CLR_END the checksum variation ("delta") caused by the
operation. For example if a DELETE caused the table's live checksum
to change from 123 to 456, we store in the UNDO_ROW_DELETE, in 4 bytes,
the value 333 (456-123).
* Instead of hard-coded "1" as length of the place where we store
the undone record's type in CLR_END, use a symbol CLR_TYPE_STORE_SIZE;
use macros clr_type_store and clr_type_korr.
* write_block_record() has a new parameter 'old_record_checksum'
which is the pre-computed checksum of old_record; that value is used
to update the table's live checksum when writing UNDO_ROW_UPDATE|CLR_END.
* In allocate_write_block_record(), if we are executing UNDO_ROW_DELETE
the row's checksum is already computed.
* _ma_update_block_record2() now expect the new row's checksum into
cur_row.checksum (was already true) and the old row's checksum into
new_row.checksum (that's new). Its two callers, maria_update() and
_ma_apply_undo_row_update(), honour this.
* When executing an UNDO_ROW_INSERT|UPDATE|DELETE in UNDO phase, pick
up the checksum delta from the log record. It is then used to update
the table's live checksum when writing CLR_END, and saves us a
computation of record.
storage/maria/ma_blockrec.h:
in-write hooks move from ma_loghandler.c
storage/maria/ma_check.c:
more straightforward size of buffer
storage/maria/ma_checkpoint.c:
<= is enough
storage/maria/ma_commit.c:
new prototype of translog_write_record()
storage/maria/ma_create.c:
new prototype of translog_write_record()
storage/maria/ma_delete.c:
The row's checksum must be computed before calling(*delete_record)(),
not after, because it must be known inside _ma_delete_block_record()
(to update the table's live checksum when writing UNDO_ROW_DELETE).
If deleting from a transactional table, live checksum was already updated
when writing UNDO_ROW_DELETE.
storage/maria/ma_delete_all.c:
@todo is now done (in ma_loghandler.c)
storage/maria/ma_delete_table.c:
new prototype of translog_write_record()
storage/maria/ma_loghandler.c:
* in-write hooks move to ma_blockrec.c.
* translog_write_record() gets a new argument 'hook_arg' which is
passed down to pre|inwrite_rec_hook. It is more useful that 'parts'
for those hooks, because when those hooks are called, 'parts' has
possibly been mangled (like with LSN compression) and is so
unpredictable.
* fix for compiler warning (unused buffer_start when compiling without
debug support)
* Because checksum delta is stored into UNDO_ROW_INSERT|UPDATE|DELETE
and CLR_END, but only if the table has live checksum, these records
are not PSEUDOFIXEDLENGTH anymore, they are now VARIABLE_LENGTH (their
length is X if no live checksum and X+4 otherwise).
* add an inwrite_rec_hook for UNDO_ROW_UPDATE, which updates the
table's live checksum. Update it also in hooks of UNDO_ROW_INSERT|
DELETE and REDO_DELETE_ALL and CLR_END.
* Bugfix: when reading a record in translog_read_record(), it happened
that "length" became negative, because the function assumed that
the record extended beyond the page's end, whereas it may be shorter.
storage/maria/ma_loghandler.h:
* Instead of hard-coded "1" and "4", use symbols and macros
to store/retrieve the type of record which the CLR_END corresponds
to, and the checksum variation caused by the operation which logs the
record
* translog_write_record() gets a new argument 'hook_arg' which is
passed down to pre|inwrite_rec_hook. It is more useful that 'parts'
for those hooks, because when those hooks are called, 'parts' has
possibly been mangled (like with LSN compression) and is so
unpredictable.
storage/maria/ma_open.c:
fix for "empty body in if() statement" (when compiling without safemutex)
storage/maria/ma_pagecache.c:
<= is enough
storage/maria/ma_recovery.c:
* print the time that each recovery phase (REDO/UNDO/flush) took;
this is enabled only when recovering from ha_maria. Is it printed
n seconds with a fractional part of one digit (like 123.4 seconds).
* In the REDO phase, update the table's live checksum by using
the checksum delta stored in UNDO_ROW_INSERT|DELETE|UPDATE and CLR_END.
Update it too when seeing REDO_DELETE_ALL.
* In the UNDO phase, when executing UNDO_ROW_INSERT, if the table does
not have live checksum then reading the record's header (as done by
the master loop of run_undo_phase()) is enough; otherwise we
do a translog_read_record() to have the checksum delta ready
for _ma_apply_undo_row_insert().
* When at the end of the REDO phase we notice that there is an unfinished
group of REDOs, don't assert in debug binaries, as I verified that it
can happen in real life (with kill -9)
* removing ' in #error as it confuses gcc3
storage/maria/ma_rename.c:
new prototype of translog_write_record()
storage/maria/ma_test_recovery.expected:
Change in output of ma_test_recovery: now all live checksums of
original tables equal those of tables recreated by the REDO phase
and those of tables fixed by the UNDO phase. I.e. recovery of
the live checksum looks like working (which was after all the only
goal of this changeset).
I checked by hand that it's not just all live checksums which are
now 0 and that's why they match. They are the old values like
3757530372. maria.test has hard-coded checksum values in its result
file so checks this too.
storage/maria/ma_update.c:
* It's useless to put up HA_STATE_CHANGED in 'key_changed',
as we put up HA_STATE_CHANGED in info->update anyway.
* We need to compute the old and new rows' checksum before calling
(*update_record)(), as checksum delta must be known when logging
UNDO_ROW_UPDATE which is done by _ma_update_block_record(). Note that
some functions change the 'newrec' record (at least _ma_check_unique()
does) so we cannot move the checksum computation too early in the
function.
storage/maria/ma_write.c:
If inserting into a transactional table, live's checksum was
already updated when writing UNDO_ROW_INSERT. The multiplication
is a trick to save an if().
storage/maria/unittest/ma_test_loghandler-t.c:
new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
new prototype of translog_write_record()
storage/maria/unittest/ma_test_loghandler_purge-t.c:
new prototype of translog_write_record()
storage/myisam/sort.c:
fix for compiler warnings in pushbuild (write_merge_key* functions
didn't have their declaration match MARIA_HA::write_key).
gprof build for amd64.
storage/maria/ma_loghandler.c:
fater way to detect last page address for the last log file.
BUILD/compile-amd64-gprof-no-ndb:
New BitKeeper file ``BUILD/compile-amd64-gprof-no-ndb''
REDO optimization (Bascily avoid moving blocks from/to pagecache)
More command line arguments to maria_read_log
Fixed recovery bug when recreating table
sql/opt_range.cc:
Remove SAFE_MODE for opt_range as it disables UPDATE to use keys
storage/maria/ma_blockrec.c:
REDO optimization
Use new interface for pagecache_reads to avoid copying page buffers
storage/maria/ma_loghandler.c:
Patch from Sanja:
- Added new parameter to translog_get_page to use direct links to pagecache
- Changed scanner to be able to use direct links
This avoids a lot of calls to bmove512() in page cache.
storage/maria/ma_loghandler.h:
Added direct link to pagecache objects
storage/maria/ma_open.c:
Added const to parameter
Added missing braces
storage/maria/ma_pagecache.c:
From Sanja:
- Added direct links to pagecache (from pagecache_read())
Dirrect link means that on pagecache_read we get back a pointer to the pagecache buffer
From Monty:
- Fixed arguments to init_page_cache to handle big page caches
- Fixed compiler warnings
- Replaced PAGECACHE_PAGE_LINK with PAGECACHE_BLOCK_LINK * to catch errors
storage/maria/ma_pagecache.h:
Changed block numbers from int to long to be able to handle big page caches
Changed some PAGECACHE_PAGE_LINK to PAGECACHE_BLOCK_LINK
storage/maria/ma_recovery.c:
Fixed recovery bug when recreating table (table was kept open)
Moved some variables to function start (portability)
Added space to some print messages
storage/maria/maria_chk.c:
key_buffer_size -> page_buffer_size
storage/maria/maria_def.h:
Changed default page_buffer_size to 10M
storage/maria/maria_read_log.c:
Added more startup options:
--version
--undo (apply undo)
--page_cache_size (to run with big cache sizes)
--silent (to not get any output from --apply)
storage/maria/unittest/ma_control_file-t.c:
Fixed compiler warning
storage/maria/unittest/ma_test_loghandler-t.c:
Added new argument to translog_init_scanner()
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
Added new argument to translog_init_scanner()
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
Added new argument to translog_init_scanner()
record (for example in ma_test_loghandler-t). Reason was
wrongly matched () in DBUG.
storage/maria/ma_loghandler.c:
Wrongly matched parenthesis:
DBUG_PRINT(keyword, argslist) expands to roughly
_db_doprnt arglist;
So DBUG_PRINT("enter",(a);
b;
c);
expands to roughly
_db_doprnt(a);b;c;
which is valid code. Except that in non-debug builds, DBUG_PRINT(
expands to nothing so the wrongly "included" code is thrown away,
leading to some members of "scanner" to not be initialized.
horizon addres.
storage/maria/ma_loghandler.c:
New function to get correct LSN from chunk address.
storage/maria/ma_loghandler.h:
New function to get correct LSN from chunk address.
storage/maria/ma_checkpoint.c:
The new macro for easier printing LSN added.
storage/maria/ma_loghandler.c:
The assertion returned.
The new macro for easier printing LSN added.
storage/maria/ma_loghandler_lsn.h:
The new macro for easier printing LSN added.
storage/maria/ma_pagecache.c:
The new macro for easier printing LSN added.
storage/maria/ma_recovery.c:
Recovery checks empty log state.
RECHEADER_READ_ERROR means some real error.
storage/maria/maria_read_log.c:
Read log starts from real beggining of the log and precess
error and empty log states.
The new macro for easier printing LSN added.
storage/maria/unittest/ma_test_loghandler-t.c:
The new macro for easier printing LSN added.
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
The new macro for easier printing LSN added.
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
The new macro for easier printing LSN added.
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
The new macro for easier printing LSN added.
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
The new macro for easier printing LSN added.
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
The new macro for easier printing LSN added.
Finally this is the real checkpoint code.
It however exhibits unstabilities when a checkpoint runs concurrently
with data-modifying clients (table corruption, transaction log's
assertions) so for now a checkpoint is taken only at startup after
recovery and at shutdown, i.e. not in concurrent situations. Later
we will let it run periodically, as well as flush dirty pages
periodically (almost all needed code is there already, only pagecache
code is written but not committed).
WL#3072 Maria recovery
* replacing UNDO_ROW_PURGE with CLR_END; testing of those CLR_END via
ma_test2 which has INSERTs failing with duplicate keys.
* replaying of REDO_RENAME_TABLE
Now, off to test Recovery in ha_maria :)
BitKeeper/deleted/.del-ma_least_recently_dirtied.c:
Delete: storage/maria/ma_least_recently_dirtied.c
BitKeeper/deleted/.del-ma_least_recently_dirtied.h:
Delete: storage/maria/ma_least_recently_dirtied.h
storage/maria/Makefile.am:
compile Checkpoint module
storage/maria/ha_maria.cc:
When ha_maria starts, do a recovery from last checkpoint.
Take a checkpoint when that recovery has ended and when ha_maria
shuts down cleanly.
storage/maria/ma_blockrec.c:
* even if my_sync() fails we have to my_close() (otherwise we leak
a descriptor)
* UNDO_ROW_PURGE is replaced by a simple CLR_END for UNDO_ROW_INSERT,
as promised in the old comment; it gives us skipping during the
UNDO phase.
storage/maria/ma_check.c:
All REDOs before create_rename_lsn are ignored by Recovery. So
create_rename_lsn must be set only after all data/index has been
flushed and forced to disk. We thus move write_log_record_for_repair()
to after _ma_flush_tables_files_after_repair().
storage/maria/ma_checkpoint.c:
Checkpoint module.
storage/maria/ma_checkpoint.h:
optional argument if caller wants a thread to periodically take
checkpoints and flush dirty pages.
storage/maria/ma_create.c:
* no need to init some vars as the initial bzero(share) takes care of this.
* update to new function's name
* even if we fail in my_sync() we have to my_close()
storage/maria/ma_extra.c:
Checkpoint reads share->last_version under intern_lock, so we make
maria_extra() update it under intern_lock. THR_LOCK_maria still needed
because of _ma_test_if_reopen().
storage/maria/ma_init.c:
destroy checkpoint module when Maria shuts down.
storage/maria/ma_loghandler.c:
* UNDO_ROW_PURGE gone (see ma_blockrec.c)
* we need to remember the LSN of the LOGREC_FILE_ID for a share,
because this LSN is needed into the checkpoint record (Recovery wants
to know the validity domain of an id->name mapping)
* translog_get_horizon_no_lock() needed for Checkpoint
* comment about failing assertion (Sanja knows)
* translog_init_reader_data() thought that translog_read_record_header_scan()
returns 0 in case of error, but 0 just means "0-length header".
* translog_assign_id_to_share() now needs the MARIA_HA because
LOGREC_FILE_ID uses a log-write hook.
* Verify that (de)assignment of share->id happens only under intern_lock,
as Checkpoint reads this id with intern_lock.
* translog_purge() can accept TRANSLOG_ADDRESS, not necessarily
a real LSN.
storage/maria/ma_loghandler.h:
prototype updates
storage/maria/ma_open.c:
no need to initialize "res"
storage/maria/ma_pagecache.c:
When taking a checkpoint, we don't need to know the maximum rec_lsn
of dirty pages; this LSN was intended to be used in the two-checkpoint
rule, but last_checkpoint_lsn is as good.
4 bytes for stored_list_size is enough as PAGECACHE::blocks (number
of blocks which the pagecache can contain) is int.
storage/maria/ma_pagecache.h:
new prototype
storage/maria/ma_recovery.c:
* added replaying of REDO_RENAME_TABLE
* UNDO_ROW_PURGE gone (see ma_blockrec.c), replaced by CLR_END
* Recovery from the last checkpoint record now possible
* In new_table() we skip the table if the id->name mapping is older than
create_rename_lsn (mapping dates from lsn_of_file_id).
* in get_MARIA_HA_from_REDO_record() we skip the record
if the id->name mapping is newer than the record (can happen if processing
a record which is before the checkpoint record).
* parse_checkpoint_record() has to return a LSN, that's what caller expects
storage/maria/ma_rename.c:
new function's name; log end zeroes of tables' names (ease recovery)
storage/maria/ma_test2.c:
* equivalent of ma_test1's --test-undo added (named -u here).
* -t=1 now stops right after creating the table, so that
we can test undoing of INSERTs with duplicate keys (which tests the
CLR_END logged by _ma_write_abort_block_record()).
storage/maria/ma_test_recovery.expected:
Result of testing undoing of INSERTs with duplicate keys; there are
some differences in maria_chk -dvv but they are normal (removing
records does not shrink data/index file, does not put back the
"analyzed, optimized keys"(etc) index state.
storage/maria/ma_test_recovery:
Test undoing of INSERTs with duplicate keys, using ma_test2;
when such INSERT happens, it logs REDO_INSERT, UNDO_INSERT, REDO_DELETE,
CLR_END; we abort after that, and test that CLR_END causes recovery
to jump over UNDO_INSERT.
storage/maria/ma_write.c:
comment
storage/maria/maria_chk.c:
comment
storage/maria/maria_def.h:
* a new bit in MARIA_SHARE::in_checkpoint, used to build a list
of unique shares during Checkpoint.
* MARIA_SHARE::lsn_of_file_id added: the LSN of the last LOGREC_FILE_ID
for this share; needed to know to which LSN domain the mappings
found in the Checkpoint record apply (new mappings should not apply
to old REDOs).
storage/maria/trnman.c:
* small changes to how trnman_collect_transactions() fills its buffer;
it also uses a non-dummy lsn_read_non_atomic() found in ma_checkpoint.h
* testing of execution of UNDO_ROW_UPDATE
* when executing an UNDO_ROW_UPDATE, store "UNDO_ROW_UPDATE" as
"type of undone record" into the CLR_END record.
storage/maria/ma_blockrec.c:
When logging a CLR_END in write_block_record(), it can be for
a DELETE or for an UPDATE (now that Monty has coded execution of
UNDO_UPDATE)
storage/maria/ma_loghandler.c:
UNDO_ROW_UPDATE's execution coded, so no crash
storage/maria/ma_recovery.c:
UNDO_ROW_UPDATE's execution now coded, so no crash
storage/maria/ma_test1.c:
upper case letter
storage/maria/ma_test_recovery.expected:
output of testing execution of UNDO_ROW_UPDATE. Table's checksum
not recovered (known issue not specific to UPDATE).
storage/maria/ma_test_recovery:
Test execution of UNDO_ROW_UPDATE: first we stop ma_test1 after
deletes and commit, then we stop ma_test1 after updates and abort;
we verify that updates are rolled back by comparing tables
into mysql.com:/home/my/mysql-maria
storage/maria/ma_check.c:
Auto merged
storage/maria/ma_locking.c:
Auto merged
storage/maria/ma_loghandler.c:
Auto merged
storage/maria/ma_open.c:
Auto merged
storage/maria/ma_recovery.c:
Auto merged
storage/maria/maria_def.h:
Auto merged
storage/maria/maria_read_log.c:
Auto merged
storage/maria/ma_blockrec.c:
Manual merge
storage/maria/ma_test1.c:
Manual merge (using Guilhems code)
Fixed bug in duplicate key handling for block records during repair
All read-row methods now return error number in case of error
Don't calculate checksum for null fields
Fixed bug when running maria_read_log with -o
BUILD/SETUP.sh:
Added STACK_DIRECTION
BUILD/compile-pentium-debug-max:
Moved STACK_DIRECTION to SETUP
include/myisam.h:
Added extra parameter to write_key
storage/maria/ma_blockrec.c:
Added applying of undo for updates
Fixed indentation
Removed some not needed casts
Fixed wrong logging of CLR record
Split ma_update_block_record to two functions to be able to reuse it from undo-applying
Simplify filling of packed fields
ma_record_block_record) now returns error number on failure
Sligtly changed log record information for undo-update
storage/maria/ma_check.c:
Fixed bug in duplicate key handling for block records during repair
storage/maria/ma_checksum.c:
Don't calculate checksum for null fields
storage/maria/ma_dynrec.c:
_ma_read_dynamic_reocrd() now returns error number on error
Rest of the changes are code simplification and indentation fixes
storage/maria/ma_locking.c:
Added comment
storage/maria/ma_loghandler.c:
More debugging
Removed printing of total_record_length as this was always same as record_length
storage/maria/ma_open.c:
Allocate bitmap for changed fields
storage/maria/ma_packrec.c:
read_record now returns error number on error
storage/maria/ma_recovery.c:
Fixed wrong arguments to undo_row_update
storage/maria/ma_statrec.c:
read_record now returns error number on error (not 1)
Code simplification
storage/maria/ma_test1.c:
Added exit possibility after update phase (to test undo of updates)
storage/maria/maria_def.h:
Include bitmap header file
storage/maria/maria_read_log.c:
Fixed bug when running with -o
end of translog_flush() when datadir was in /dev/shm.
storage/maria/ma_loghandler.c:
directory syncing can fail on shared memory devices (/dev/shm on Linux
in this case); see my_sync_dir().
Recovery of state.records (the count of records which is stored into
the header of the index file). For that, state.is_of_lsn is introduced;
logic is explained in ma_recovery.c (look for "Recovery of the state").
The net gain is that in case of crash, we now recover state.records,
and it is idempotent (ma_test_recovery tests it).
state.checksum is not recovered yet, mail sent for discussion.
- WL#3071 Maria Checkpoint: preparation for it, by protecting
all modifications of the state in memory or on disk with intern_lock
(with the exception of the really-often-modified state.records,
which is now protected with the log's lock, see ma_recovery.c
(look for "Recovery of the state"). Also, if maria_close() sees that
Checkpoint is looking at this table it will not my_free() the share.
- don't compute row's checksum twice in case of UPDATE (correction
to a bugfix I made yesterday).
storage/maria/ha_maria.cc:
protect state write with intern_lock (against Checkpoint)
storage/maria/ma_blockrec.c:
* don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it
should wait until we have corrected the allocation in the bitmap
(as the REDO can serve to correct the allocation during Recovery);
introducing _ma_finalize_row() for that.
* In a changeset yesterday I moved computation of the checksum
into write_block_record(), to fix a bug in UPDATE. Now I notice
that maria_update() already computes the checksum, it's just that
it puts it into info->cur_row while _ma_update_block_record()
uses info->new_row; so, removing the checksum computation from
write_block_record(), putting it back into allocate_and_write_block_record()
(which is called only by INSERT and UNDO_DELETE), and copying
cur_row->checksum into new_row->checksum in _ma_update_block_record().
storage/maria/ma_check.c:
new prototypes, they will take intern_lock when writing the state;
also take intern_lock when changing share->kfile. In both cases
this is to protect against Checkpoint reading/writing the state or reading
kfile at the same time.
Not updating create_rename_lsn directly at end of write_log_record_for_repair()
as it wouldn't have intern_lock.
storage/maria/ma_close.c:
Checkpoint builds a list of shares (under THR_LOCK_maria), then it
handles each such share (under intern_lock) (doing flushing etc);
if maria_close() freed this share between the two, Checkpoint
would see a bad pointer. To avoid this, when building the list Checkpoint
marks each share, so that maria_close() knows it should not free it
and Checkpoint will free it itself.
Extending the zone covered by intern_lock to protect against
Checkpoint reading kfile, writing state.
storage/maria/ma_create.c:
When we update create_rename_lsn, we also update is_of_lsn to
the same value: it is logical, and allows us to test in maria_open()
that the former is not bigger than the latter (the contrary is a sign
of index header corruption, or severe logging bug which hinders
Recovery, table needs a repair).
_ma_update_create_rename_lsn_on_disk() also writes is_of_lsn;
it now operates under intern_lock (protect against Checkpoint),
a shortcut function is available for cases where acquiring
intern_lock is not needed (table's creation or first open).
storage/maria/ma_delete.c:
if table is transactional, "records" is already decremented
when logging UNDO_ROW_DELETE.
storage/maria/ma_delete_all.c:
comments
storage/maria/ma_extra.c:
Protect modifications of the state, in memory and/or on disk,
with intern_lock, against a concurrent Checkpoint.
When state goes to disk, update it's is_of_lsn (by calling
the new _ma_state_info_write()).
In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing
a change I made a few days ago) and ASK_MONTY
storage/maria/ma_locking.c:
no real code change here.
storage/maria/ma_loghandler.c:
Log-write-hooks for updating "state.records" under log's mutex
when writing/updating/deleting a row or deleting all rows.
storage/maria/ma_loghandler_lsn.h:
merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different)
storage/maria/ma_open.c:
When opening a table verify that is_of_lsn >= create_rename_lsn; if
false the header must be corrupted.
_ma_state_info_write() is split in two: _ma_state_info_write_sub()
which is the old _ma_state_info_write(), and _ma_state_info_write()
which additionally takes intern_lock if requested (to protect
against Checkpoint) and updates is_of_lsn.
_ma_open_keyfile() should change kfile.file under intern_lock
to protect Checkpoint from reading a wrong kfile.file.
storage/maria/ma_recovery.c:
Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT
which has a LSN > state.is_of_lsn it increments state.records.
Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE.
When closing a table during Recovery, we know its state is at least
as new as the current log record we are looking at, so increase
is_of_lsn to the LSN of the current log record.
storage/maria/ma_rename.c:
update for new behaviour of _ma_update_create_rename_lsn_on_disk().
storage/maria/ma_test1.c:
update to new prototype
storage/maria/ma_test2.c:
update to new prototype (actually prototype was changed days ago,
but compiler does not complain about the extra argument??)
storage/maria/ma_test_recovery.expected:
new result file of ma_test_recovery. Improvements: record
count read from index's header is now always correct.
storage/maria/ma_test_recovery:
"rm" fails if file does not exist. Redirect stderr of script.
storage/maria/ma_write.c:
if table is transactional, "records" is already incremented when
logging UNDO_ROW_INSERT. Comments.
storage/maria/maria_chk.c:
update is_of_lsn too
storage/maria/maria_def.h:
- MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored
into the index file's header.
- Checkpoint can now mark a table as "don't free this", and maria_close()
can reply "ok then you will free it".
- new functions
storage/maria/maria_pack.c:
update for new name
misc fixes of execution of UNDOs in the UNDO phase:
- into the CLR_END, store the LSN of the _previous_ UNDO (we debated
what was best, so far we're going with "previous"; later we can change
to "current" if needed), and store the type of record which is being
undone (needed to know how to update state.records when we see the
CLR_END during the REDO phase).
- declaring all UNDOs and CLR_END as "compressed"
- when executing an UNDO in the UNDO phase, state.records is updated
as a hook when writing CLR_END (needed for "recovery of the state"),
and so is trn->undo_lsn (needed for when we have checkpoints).
- bugfix (execution of UNDO_ROW_DELETE didn't store the correct checksum
into the re-inserted row, maria_chk -r thus threw the row away).
- modifications of ma_test1: where to stop is now driven by --testflag;
--test-undo just tells how to stop (flush data, flush log, nothing).
- ma_test_recovery: testing of the UNDO phase, more testing of the
REDO phase, identification of a bug.
storage/maria/ma_blockrec.c:
- bugfix: execution of UNDO_ROW_DELETE didn't store the correct
checksum into the row (leading to "maria_chk -r" eliminating the
re-inserted row, net effect was that rollback appeared to have
rolled back no deletion). Reason was that write_block_record() used
info->cur_row.checksum, while "row" can be != &info->cur_row
(case of UNDO_ROW_DELETE). After fixing this, problems with
_ma_update_block_record() appeared; indeed checksum was computed
by allocate_and_write_block_record() while _ma_update_block_record()
directly calls write_block_record(). Solution is to compute checksum
in write_block_record() instead.
- when executing an UNDO, we now pass the LSN of the _previous_ UNDO
to block_format functions. This LSN can be 0 (if the being-executed UNDO
was the transaction's first UNDO), so "undo_lsn==0" cannot work
anymore to indicate "this is not UNDO work". Using undo_lsn==LSN_ERROR
instead (this is an impossible LSN).
- store into CLR_END the type of log record which was undone
(INSERT/UPDATE/DELETE); needed for Recovery to know if/how it has
to update state.records if it sees this CLR_END in the REDO phase.
- when writing the CLR_END in _ma_apply_undo_row_insert(),
the place to store file's id is log_data+LSN_STORE_SIZE.
- in _ma_apply_undo_row_insert(), the records-- is moved
to a hook when writing the CLR_END (this way it is under log's mutex
which is needed for "recovery of the state")
storage/maria/ma_loghandler.c:
- all UNDOs, and CLR_END, start with the LSN of another UNDO; so
we can declare them "compressed".
- write_hook_for_clr_end() to set trn->undo_lsn (to the previous
UNDO's LSN) under log's lock (like UNDOs set trn->undo_lsn under log's
lock), and also update, if appropriate, state.records.
- reset share->id to 0 when deassigning; not useful for now but
sounds logical.
storage/maria/ma_recovery.c:
- if no table is found for a REDO, it's not an error; for an UNDO, it is
- in the REDO phase, when we see a CLR_END we must update trn->undo_lsn
and sometimes state.records.
- in the UNDO phase, when we execute an UNDO_ROW_INSERT:
* update trn->undo_lsn only after executing the record
* store the _previous_ undo_lsn into the CLR_END
- at the end of the REDO phase, when we recreate TRN objects, they
have already their long id in the log (either via a
LOGREC_LONG_TRANSACTION_ID, or in a checkpoint record), don't write
a new, useless LOGREC_LONG_TRANSACTION_ID for them.
storage/maria/ma_test1.c:
* where to stop execution is now driven by --testflag and not --test-undo
(ma_test2 already has --testflag for the same purpose). This allows
us to do a clean stop (with commit) at any point.
* --test-undo=# tells how to abort (flush all pages (which implies
flushing log) or only log or nothing); all such "ways of crashing"
are tested in ma_test_recovery
storage/maria/ma_test_recovery:
* Testing execution of UNDOs, with and without BLOBs.
* Testing idempotency of REDOs.
* See @todo for a probable bug with BLOBs.
* maria_chk -rq instead of -r, as with -q it nicely stops on any
problem in the data file (like the checksum bug see comment of
ma_blockrec.c).
* Testing if log was written by UNDO phase (often expected),
not written by REDO phase (always expected).
* Less output on the screen, compares with expected output in the end.
* some shell thingies like "set --" and $# are courtesy of
Danny and Pekka.
storage/maria/maria_read_log.c:
when only displaying the records, don't do an UNDO phase
storage/maria/ma_test_recovery.expected:
This is the expected output of a great part of ma_test_recovery.
ma_test_recovery compares its output to the expected output
and tells if different.
If we look at this file it mentions differences in checksum
(normal, it's not recovered yet) and in records count
(getting a correct records' count when recovery starts on an
already existing table, like when testing rollback,
is coded but not yet pushed).
Small fixes made.
storage/maria/ma_loghandler.c:
Check of transaction log descriptor table consistance added.\
Incorrect record description fixed.
Compiler warning fixed.
storage/maria/ma_loghandler.h:
fixed ident.
storage/maria/unittest/ma_test_loghandler-t.c:
Suppressing of automatic record writing
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
Suppressing of automatic record writing
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
Suppressing of automatic record writing
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
Suppressing of automatic record writing
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
Suppressing of automatic record writing
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
Suppressing of automatic record writing
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
Suppressing of automatic record writing
storage/maria/unittest/ma_test_loghandler_purge-t.c:
Suppressing of automatic record writing
storage/maria/ma_blockrec.c:
Added UNDO handling of insert during recovery
To do this, I also had to add write locking of tail pages during undo phase (As we need to access the same page twice if extents are split over two pages)
Another way to handle the undo of insert would be to store the extent information as part of the UNDO_INSERT block.
storage/maria/ma_blockrec.h:
Added new prototype
storage/maria/ma_loghandler.c:
Changed type of CLR_END (to avoid crash in log handler)
Removed not used variable
storage/maria/ma_loghandler.h:
Added TRN argument to record_execute_in_undo_phase()
storage/maria/ma_pagecache.c:
Hack for undo phase of recovery. During REDO we work with PLAIN pages, but UNDO works with LSN pages, which caused an abort when trying to access a cached page.
storage/maria/ma_recovery.c:
Added execution of UNDO_ROW_INSERT
storage/maria/ma_test1.c:
Added option --test-undo for testing recovery with undo
storage/maria/maria_read_log.c:
Added processing of undos
into desktop.sanja.is.com.ua:/home/bell/mysql/bk/work-maria-purge
storage/maria/ma_loghandler.c:
Auto merged
storage/maria/ma_loghandler.h:
Auto merged
into gbichot4.local:/home/mysql_src/mysql-maria-for-undo-phase
storage/maria/ha_maria.cc:
Auto merged
storage/maria/ma_blockrec.c:
Auto merged
storage/maria/ma_loghandler.c:
Auto merged
storage/maria/ma_loghandler.h:
Auto merged
storage/maria/ma_loghandler_lsn.h:
Auto merged
storage/maria/maria_chk.c:
Auto merged
storage/maria/maria_read_log.c:
Auto merged
* create page cache before initializing engine and not after, because
Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistent
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing
of rec_lsn to know if page should be skipped during Recovery
(unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)
mysys/my_realloc.c:
noted an issue in my_realloc()
sql/mysqld.cc:
page cache needs to be created before engines are initialized,
because Maria's initialization may do a recovery which needs
the page cache.
storage/maria/ha_maria.cc:
update to new prototype
storage/maria/ma_bitmap.c:
when creating the first bitmap page we used chsize to 8192 bytes then
pwrite (overwrite) the last 2 bytes (8191-8192). If crash between
the two operations, this leaves a bitmap page full without its end
marker. A later recovery may try to read this page and find it
exists and misses a marker and conclude it's corrupted and fail.
Changing the chsize to only 8190 bytes: recovery will then find
the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c:
Fix for a bug: when executing a REDO, if the data page is created,
data_file_length was increased before _ma_bitmap_set():
_ma_bitmap_set() called _ma_read_bitmap_page() which, due to the
increased data_file_length, expected to find a bitmap page on disk
with a correct end marker; if the bitmap page didn't exist already
in fact, this failed. Fixed by increasing data_file_length only after
_ma_read_bitmap_page() has created the new bitmap page correctly.
This bug could happen every time a REDO is about creating a new
bitmap page.
storage/maria/ma_check.c:
empty data file has a bitmap page
storage/maria/ma_control_file.c:
useless parameter to ma_control_file_create_or_open(), just
test if this is recovery.
storage/maria/ma_control_file.h:
new prototype
storage/maria/ma_create.c:
Back to how it was before: maria_create() creates an 8k bitmap page.
Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c:
as ma_test1 and ma_test2 now use real transactions and not
dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always
about real transactions, can assert this.
A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h:
new function
storage/maria/ma_loghandler_lsn.h:
maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
* enforce that DMLs on transactional tables use real transactions
and not dummy_transaction_object.
* test if table was repaired with maria_chk (which has to been
seen as an import of an external table into the server), test
validity of create_rename_lsn (header corruption detection)
* comments.
storage/maria/ma_recovery.c:
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing
of rec_lsn to know if page should be skipped during Recovery
(unused in this patch as no Checkpoint module pushed yet)
* reworking all around (less duplication)
storage/maria/ma_recovery.h:
a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c:
tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
* update to new prototype
* no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
* a function for Recovery to create a transaction (TRN), needed
in the UNDO phase
* a function for Recovery to grab an existing transaction, needed
in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h:
new functions
Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs
Note: The .MAD file is not binary identical after applying redo compare to a an original file.
(This is becasue we don't have full information which function called PURGE_REDO_BLOCKS).
To verify if a file was correctly applied, we now instead compare row checksums
BitKeeper/etc/ignore:
added storage/maria/tmp/*
include/maria.h:
Added maria_commit() and maria_begin() to be used with external tests
storage/maria/ha_maria.cc:
Ensure maria_def. is read in C mode
storage/maria/ma_blockrec.c:
Fixed redo handling.
_ma_apply_redo_purge_blocks() updated to handle any number of purged blocks
Removed code to make data file idenitcal after redo (can't easily be done). See changeset comments
Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs
storage/maria/ma_commit.c:
More DBUG statements
Moved variable declaration to start of function (portability fix)
Added helper functions 'maria_commit()' and 'maria_begin()'
storage/maria/ma_loghandler.c:
Fixed wrong REDO_PURGE_BLOCKS initialization
storage/maria/ma_recovery.c:
Added UNDO_ROW_UPDATE
Removed wrong setting of lsn (there was no lsn at the used position)
Fixed REDO_PURGE_BLOCKS to handle any number of blocks
storage/maria/ma_test1.c:
Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log
storage/maria/ma_test2.c:
Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log
storage/maria/ma_test_recovery:
Create temporary files in maria/tmp
Verify files with checksums instead of byte comparisons
storage/maria/maria_chk.c:
When using with -dss we only get filename, records and checksum.
This is useful to do a quick comparision if a files is identical to another one.
storage/maria/maria_def.h:
Added ma_commit()
storage/maria/maria_read_log.c:
Added --help
File number types fixed.
storage/maria/unittest/Makefile.am:
Test of purger added.
storage/maria/unittest/ma_test_loghandler_purge-t.c:
New BitKeeper file ``storage/maria/unittest/ma_test_loghandler_purge-t.c''
in the file to the file header.
storage/maria/ma_loghandler.h:
Getting maximum LSN of the record which parts written
in the file to the file header.
storage/maria/unittest/Makefile.am:
Test suite for getting max LSN added.
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
Spelling fixed.
Cleanup fixed.
storage/maria/unittest/ma_test_loghandler_noflush-t.c:
Cleanup fixed.
storage/maria/unittest/ma_test_loghandler_max_lsn-t.c:
New BitKeeper file ``storage/maria/unittest/ma_test_loghandler_max_lsn-t.c''
storage/maria/ma_checkpoint.c:
Definitions of LSN should be collected in
the one file (ma_loghandler_lsn.h)
storage/maria/ma_loghandler.c:
New calls to get first theoretical and first real LSN.
storage/maria/ma_loghandler.h:
New calls to get first theoretical and first real LSN.
storage/maria/ma_loghandler_lsn.h:
Defined yet another impossible LSN to indicate error.
storage/maria/ma_recovery.c:
The first LSN call changed.
storage/maria/maria_read_log.c:
The first LSN call changed.
storage/maria/unittest/Makefile.am:
New unittest added.
storage/maria/unittest/ma_test_loghandler_first_lsn-t.c:
New BitKeeper file ``storage/maria/unittest/ma_test_loghandler_first_lsn-t.c''