mariadb/storage/maria/ma_commit.c

125 lines
3.8 KiB
C
Raw Normal View History

- WL#3239 "log CREATE TABLE in Maria" - WL#3240 "log DROP TABLE in Maria" - similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and DELETE no_WHERE_clause (== the DELETE which just truncates the files) - create_rename_lsn added to MARIA_SHARE's state - all these operations (except DROP TABLE) also update the table's create_rename_lsn, which is needed for the correctness of Recovery (see function comment of _ma_repair_write_log_record() in ma_check.c) - write a COMMIT record when transaction commits. - don't log REDOs/UNDOs if this is an internal temporary table like inside ALTER TABLE (I expect this to be a big win). There was already no logging for user-created "CREATE TEMPORARY" tables. - don't fsync files/directories if the table is not transactional - in translog_write_record(), autogenerate a 2-byte-id for the table and log the "id->name" pair (LOGREC_FILE_ID); log LOGREC_LONG_TRANSACTION_ID; automatically store the table's 2-byte-id in any log record. - preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint when some dirty pages are unknown; capturing trn->rec_lsn, trn->first_undo_lsn for Checkpoint and log's low-water-mark computing. - assertions, comments. storage/maria/Makefile.am: more files to build storage/maria/ha_maria.cc: - logging a REPAIR log record if REPAIR/OPTIMIZE was successful. - ha_maria::data_file_type does not have to be set in every info() call, just do it once in open(). - if caller said that transactionality can be disabled (like if caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we temporarily disable transactionality of the table in external_lock(); that will ensure that no REDOs/UNDOs are logged for this possibly massive write operation (they are not needed, as if any write fails, the table will be dropped). We re-enable in external_lock(F_UNLCK), which in ALTER TABLE happens before the tmp table replaces the original one (which is good, as thus the final table will have a REDO RENAME and a correct create_rename_lsn). - when we commit we also have to write a log record, so trnman_commit_trn() calls become ma_commit() calls - at end of engine's initialization, we are potentially entering a multi-threaded dangerous world (clients are going to be accepted) and so some assertions of mutex-owning become enforceable, for that we set maria_multi_threaded=TRUE (see ma_control_file.c) storage/maria/ha_maria.h: new member ha_maria::save_transactional (see also ha_maria.cc) storage/maria/ma_blockrec.c: - fixing comments according to discussion with Monty - if a table is transactional but temporarily non-transactional (like in ALTER TABLE), we need to give a sensible LSN to the pages (and, if we give 0, pagecache asserts). - translog_write_record() now takes care of storing the share's 2-byte-id in the log record storage/maria/ma_blockrec.h: fixing comment according to discussion with Monty storage/maria/ma_check.c: When REPAIR/OPTIMIZE modify the data/index file, if this is a transactional table, they must sync it; if they remove files or rename files, they must sync the directory, so that everything is durable. This is just applying to REPAIR/OPTIMIZE the logic already implemented in CREATE/DROP/RENAME a few months ago. Adding a function to write a LOGREC_REPAIR_TABLE at end of REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and to update the table's create_rename_lsn. storage/maria/ma_close.c: fix for a future bug storage/maria/ma_control_file.c: ensuring that if Maria is running in multi-threaded mode, anybody wanting to write to the control file and update last_checkpoint_lsn/last_logno owns the log's lock. storage/maria/ma_control_file.h: see ma_control_file.c storage/maria/ma_create.c: when creating a table: - sync it and its directory only if this is a transactional table and there is a log (no point in syncing in maria_chk) - decouple the two uses of linkname/linkname_ptr (for index file and for data file) into more variables, as we need to know all links until the moment we write the LOGREC_CREATE_TABLE. - set share.data_file_type early so that _ma_initialize_data_file() knows it (Monty's bugfix so that a table always has at least a bitmap page when it is created; so data-file is not 0 bytes anymore). - log a LOGREC_CREATE_TABLE; it contains the bytes which we have just written to the index file's header. Update table's create_rename_lsn. - syncing of kfile had been bugified in a previous merge, correcting - syncing of dfile is now needed as it's not empty anymore - in _ma_initialize_data_file(), use share's block_size and not the global one. This is a gratuitous change, both variables are equal, just that I find it more future-proof to use share-bound variable rather than global one. storage/maria/ma_delete_all.c: log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows(); update create_rename_lsn then. storage/maria/ma_delete_table.c: - logging LOGREC_DROP_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - we need to sync directories only if the table is transactional storage/maria/ma_extra.c: questions storage/maria/ma_init.c: when maria_end() is called, engine is not multithreaded storage/maria/ma_loghandler.c: - translog_inited has to be visible to ma_create() (see how it is used in ma_create()) - checkpoint record will be a single record, not three - no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will log a REDO_CREATE) - adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by truncating the files), REPAIR. - MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk - in translog_write_record(), if MARIA_SHARE does not yet have a 2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically store this short id into log records. - in translog_write_record(), if transaction has not logged its long trid, log LOGREC_LONG_TRANSACTION_ID. - For Checkpoint, we need to know the current end-of-log: adding translog_get_horizon(). - For Control File, adding an assertion that the thread owns the log's lock (control file is protected by this lock) storage/maria/ma_loghandler.h: Changes in log records (see ma_loghandler.c). new prototypes, new functions. storage/maria/ma_loghandler_lsn.h: adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn, where the most significant byte is used for flags. storage/maria/ma_open.c: storing the create_rename_lsn in the index file's header (in the state, precisely) and retrieving it from there. storage/maria/ma_pagecache.c: - my set_if_bigger was wrong, correcting it - if the first_in_switch list is not empty, it means that changed_blocks misses some dirty pages, so Checkpoint cannot run and needs to wait. A variable missing_blocks_in_changed_list is added to tell that (should it be named missing_blocks_in_changed_blocks?) - pagecache_collect_changed_blocks_with_lsn() now also tells the minimum rec_lsn (needed for low-water mark computation). storage/maria/ma_pagecache.h: see ma_pagecache.c storage/maria/ma_panic.c: comment storage/maria/ma_range.c: comment storage/maria/ma_rename.c: - logging LOGREC_RENAME_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - update create_rename_lsn - we need to sync directories only if the table is transactional storage/maria/ma_static.c: comment storage/maria/ma_test_all.sh: - tip for Valgrind-ing ma_test_all - do "export maria_path=somepath" before calling ma_test_all, if you want to run ma_test_all out of storage/maria (useful to have parallel runs, like one normal and one Valgrind, they must not use the same tables so need to run in different directories) storage/maria/maria_def.h: - state now contains, in memory and on disk, the create_rename_lsn - share now contains a 2-byte-id storage/maria/trnman.c: preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn; minimum first_undo_lsn needed to know log's low-water-mark storage/maria/trnman.h: using most significant byte of first_undo_lsn to hold miscellaneous flags, for now TRANSACTION_LOGGED_LONG_ID. dummy_transaction_object is already declared in ma_static.c. storage/maria/trnman_public.h: dummy_transaction_object was declared in all files including trnman_public.h, while in fact it's a single object. new prototype storage/maria/unittest/ma_test_loghandler-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multigroup-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multithread-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: update for new prototype storage/maria/ma_commit.c: function which wraps: - writing a LOGREC_COMMIT record (==commit on disk) - calling trnman_commit_trn() (=commit in memory) storage/maria/ma_commit.h: new header file .tree-is-private: this file is now needed to keep our tree private (don't push it to public trees). When 5.1 is merged into mysql-maria, we can abandon our maria-specific post-commit trigger; .tree_is_private will take care of keeping commit mails private. Don't push this file to public trees.
2007-06-22 14:49:37 +02:00
/* Copyright (C) 2007 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include "maria_def.h"
#include "trnman.h"
/**
@brief writes a COMMIT record to log and commits transaction in memory
@param trn transaction
@return Operation status
@retval 0 ok
@retval 1 error (disk error or out of memory)
*/
int ma_commit(TRN *trn)
{
Added maria_commit() and maria_begin() to be used with external tests Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs Note: The .MAD file is not binary identical after applying redo compare to a an original file. (This is becasue we don't have full information which function called PURGE_REDO_BLOCKS). To verify if a file was correctly applied, we now instead compare row checksums BitKeeper/etc/ignore: added storage/maria/tmp/* include/maria.h: Added maria_commit() and maria_begin() to be used with external tests storage/maria/ha_maria.cc: Ensure maria_def. is read in C mode storage/maria/ma_blockrec.c: Fixed redo handling. _ma_apply_redo_purge_blocks() updated to handle any number of purged blocks Removed code to make data file idenitcal after redo (can't easily be done). See changeset comments Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs storage/maria/ma_commit.c: More DBUG statements Moved variable declaration to start of function (portability fix) Added helper functions 'maria_commit()' and 'maria_begin()' storage/maria/ma_loghandler.c: Fixed wrong REDO_PURGE_BLOCKS initialization storage/maria/ma_recovery.c: Added UNDO_ROW_UPDATE Removed wrong setting of lsn (there was no lsn at the used position) Fixed REDO_PURGE_BLOCKS to handle any number of blocks storage/maria/ma_test1.c: Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log storage/maria/ma_test2.c: Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log storage/maria/ma_test_recovery: Create temporary files in maria/tmp Verify files with checksums instead of byte comparisons storage/maria/maria_chk.c: When using with -dss we only get filename, records and checksum. This is useful to do a quick comparision if a files is identical to another one. storage/maria/maria_def.h: Added ma_commit() storage/maria/maria_read_log.c: Added --help
2007-08-29 09:03:10 +03:00
int res;
LSN commit_lsn;
LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS];
DBUG_ENTER("ma_commit");
- WL#3239 "log CREATE TABLE in Maria" - WL#3240 "log DROP TABLE in Maria" - similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and DELETE no_WHERE_clause (== the DELETE which just truncates the files) - create_rename_lsn added to MARIA_SHARE's state - all these operations (except DROP TABLE) also update the table's create_rename_lsn, which is needed for the correctness of Recovery (see function comment of _ma_repair_write_log_record() in ma_check.c) - write a COMMIT record when transaction commits. - don't log REDOs/UNDOs if this is an internal temporary table like inside ALTER TABLE (I expect this to be a big win). There was already no logging for user-created "CREATE TEMPORARY" tables. - don't fsync files/directories if the table is not transactional - in translog_write_record(), autogenerate a 2-byte-id for the table and log the "id->name" pair (LOGREC_FILE_ID); log LOGREC_LONG_TRANSACTION_ID; automatically store the table's 2-byte-id in any log record. - preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint when some dirty pages are unknown; capturing trn->rec_lsn, trn->first_undo_lsn for Checkpoint and log's low-water-mark computing. - assertions, comments. storage/maria/Makefile.am: more files to build storage/maria/ha_maria.cc: - logging a REPAIR log record if REPAIR/OPTIMIZE was successful. - ha_maria::data_file_type does not have to be set in every info() call, just do it once in open(). - if caller said that transactionality can be disabled (like if caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we temporarily disable transactionality of the table in external_lock(); that will ensure that no REDOs/UNDOs are logged for this possibly massive write operation (they are not needed, as if any write fails, the table will be dropped). We re-enable in external_lock(F_UNLCK), which in ALTER TABLE happens before the tmp table replaces the original one (which is good, as thus the final table will have a REDO RENAME and a correct create_rename_lsn). - when we commit we also have to write a log record, so trnman_commit_trn() calls become ma_commit() calls - at end of engine's initialization, we are potentially entering a multi-threaded dangerous world (clients are going to be accepted) and so some assertions of mutex-owning become enforceable, for that we set maria_multi_threaded=TRUE (see ma_control_file.c) storage/maria/ha_maria.h: new member ha_maria::save_transactional (see also ha_maria.cc) storage/maria/ma_blockrec.c: - fixing comments according to discussion with Monty - if a table is transactional but temporarily non-transactional (like in ALTER TABLE), we need to give a sensible LSN to the pages (and, if we give 0, pagecache asserts). - translog_write_record() now takes care of storing the share's 2-byte-id in the log record storage/maria/ma_blockrec.h: fixing comment according to discussion with Monty storage/maria/ma_check.c: When REPAIR/OPTIMIZE modify the data/index file, if this is a transactional table, they must sync it; if they remove files or rename files, they must sync the directory, so that everything is durable. This is just applying to REPAIR/OPTIMIZE the logic already implemented in CREATE/DROP/RENAME a few months ago. Adding a function to write a LOGREC_REPAIR_TABLE at end of REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and to update the table's create_rename_lsn. storage/maria/ma_close.c: fix for a future bug storage/maria/ma_control_file.c: ensuring that if Maria is running in multi-threaded mode, anybody wanting to write to the control file and update last_checkpoint_lsn/last_logno owns the log's lock. storage/maria/ma_control_file.h: see ma_control_file.c storage/maria/ma_create.c: when creating a table: - sync it and its directory only if this is a transactional table and there is a log (no point in syncing in maria_chk) - decouple the two uses of linkname/linkname_ptr (for index file and for data file) into more variables, as we need to know all links until the moment we write the LOGREC_CREATE_TABLE. - set share.data_file_type early so that _ma_initialize_data_file() knows it (Monty's bugfix so that a table always has at least a bitmap page when it is created; so data-file is not 0 bytes anymore). - log a LOGREC_CREATE_TABLE; it contains the bytes which we have just written to the index file's header. Update table's create_rename_lsn. - syncing of kfile had been bugified in a previous merge, correcting - syncing of dfile is now needed as it's not empty anymore - in _ma_initialize_data_file(), use share's block_size and not the global one. This is a gratuitous change, both variables are equal, just that I find it more future-proof to use share-bound variable rather than global one. storage/maria/ma_delete_all.c: log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows(); update create_rename_lsn then. storage/maria/ma_delete_table.c: - logging LOGREC_DROP_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - we need to sync directories only if the table is transactional storage/maria/ma_extra.c: questions storage/maria/ma_init.c: when maria_end() is called, engine is not multithreaded storage/maria/ma_loghandler.c: - translog_inited has to be visible to ma_create() (see how it is used in ma_create()) - checkpoint record will be a single record, not three - no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will log a REDO_CREATE) - adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by truncating the files), REPAIR. - MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk - in translog_write_record(), if MARIA_SHARE does not yet have a 2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically store this short id into log records. - in translog_write_record(), if transaction has not logged its long trid, log LOGREC_LONG_TRANSACTION_ID. - For Checkpoint, we need to know the current end-of-log: adding translog_get_horizon(). - For Control File, adding an assertion that the thread owns the log's lock (control file is protected by this lock) storage/maria/ma_loghandler.h: Changes in log records (see ma_loghandler.c). new prototypes, new functions. storage/maria/ma_loghandler_lsn.h: adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn, where the most significant byte is used for flags. storage/maria/ma_open.c: storing the create_rename_lsn in the index file's header (in the state, precisely) and retrieving it from there. storage/maria/ma_pagecache.c: - my set_if_bigger was wrong, correcting it - if the first_in_switch list is not empty, it means that changed_blocks misses some dirty pages, so Checkpoint cannot run and needs to wait. A variable missing_blocks_in_changed_list is added to tell that (should it be named missing_blocks_in_changed_blocks?) - pagecache_collect_changed_blocks_with_lsn() now also tells the minimum rec_lsn (needed for low-water mark computation). storage/maria/ma_pagecache.h: see ma_pagecache.c storage/maria/ma_panic.c: comment storage/maria/ma_range.c: comment storage/maria/ma_rename.c: - logging LOGREC_RENAME_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - update create_rename_lsn - we need to sync directories only if the table is transactional storage/maria/ma_static.c: comment storage/maria/ma_test_all.sh: - tip for Valgrind-ing ma_test_all - do "export maria_path=somepath" before calling ma_test_all, if you want to run ma_test_all out of storage/maria (useful to have parallel runs, like one normal and one Valgrind, they must not use the same tables so need to run in different directories) storage/maria/maria_def.h: - state now contains, in memory and on disk, the create_rename_lsn - share now contains a 2-byte-id storage/maria/trnman.c: preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn; minimum first_undo_lsn needed to know log's low-water-mark storage/maria/trnman.h: using most significant byte of first_undo_lsn to hold miscellaneous flags, for now TRANSACTION_LOGGED_LONG_ID. dummy_transaction_object is already declared in ma_static.c. storage/maria/trnman_public.h: dummy_transaction_object was declared in all files including trnman_public.h, while in fact it's a single object. new prototype storage/maria/unittest/ma_test_loghandler-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multigroup-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multithread-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: update for new prototype storage/maria/ma_commit.c: function which wraps: - writing a LOGREC_COMMIT record (==commit on disk) - calling trnman_commit_trn() (=commit in memory) storage/maria/ma_commit.h: new header file .tree-is-private: this file is now needed to keep our tree private (don't push it to public trees). When 5.1 is merged into mysql-maria, we can abandon our maria-specific post-commit trigger; .tree_is_private will take care of keeping commit mails private. Don't push this file to public trees.
2007-06-22 14:49:37 +02:00
if (trn->undo_lsn == 0) /* no work done, rollback (cheaper than commit) */
Added maria_commit() and maria_begin() to be used with external tests Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs Note: The .MAD file is not binary identical after applying redo compare to a an original file. (This is becasue we don't have full information which function called PURGE_REDO_BLOCKS). To verify if a file was correctly applied, we now instead compare row checksums BitKeeper/etc/ignore: added storage/maria/tmp/* include/maria.h: Added maria_commit() and maria_begin() to be used with external tests storage/maria/ha_maria.cc: Ensure maria_def. is read in C mode storage/maria/ma_blockrec.c: Fixed redo handling. _ma_apply_redo_purge_blocks() updated to handle any number of purged blocks Removed code to make data file idenitcal after redo (can't easily be done). See changeset comments Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs storage/maria/ma_commit.c: More DBUG statements Moved variable declaration to start of function (portability fix) Added helper functions 'maria_commit()' and 'maria_begin()' storage/maria/ma_loghandler.c: Fixed wrong REDO_PURGE_BLOCKS initialization storage/maria/ma_recovery.c: Added UNDO_ROW_UPDATE Removed wrong setting of lsn (there was no lsn at the used position) Fixed REDO_PURGE_BLOCKS to handle any number of blocks storage/maria/ma_test1.c: Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log storage/maria/ma_test2.c: Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log storage/maria/ma_test_recovery: Create temporary files in maria/tmp Verify files with checksums instead of byte comparisons storage/maria/maria_chk.c: When using with -dss we only get filename, records and checksum. This is useful to do a quick comparision if a files is identical to another one. storage/maria/maria_def.h: Added ma_commit() storage/maria/maria_read_log.c: Added --help
2007-08-29 09:03:10 +03:00
DBUG_RETURN(trnman_rollback_trn(trn));
- WL#3239 "log CREATE TABLE in Maria" - WL#3240 "log DROP TABLE in Maria" - similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and DELETE no_WHERE_clause (== the DELETE which just truncates the files) - create_rename_lsn added to MARIA_SHARE's state - all these operations (except DROP TABLE) also update the table's create_rename_lsn, which is needed for the correctness of Recovery (see function comment of _ma_repair_write_log_record() in ma_check.c) - write a COMMIT record when transaction commits. - don't log REDOs/UNDOs if this is an internal temporary table like inside ALTER TABLE (I expect this to be a big win). There was already no logging for user-created "CREATE TEMPORARY" tables. - don't fsync files/directories if the table is not transactional - in translog_write_record(), autogenerate a 2-byte-id for the table and log the "id->name" pair (LOGREC_FILE_ID); log LOGREC_LONG_TRANSACTION_ID; automatically store the table's 2-byte-id in any log record. - preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint when some dirty pages are unknown; capturing trn->rec_lsn, trn->first_undo_lsn for Checkpoint and log's low-water-mark computing. - assertions, comments. storage/maria/Makefile.am: more files to build storage/maria/ha_maria.cc: - logging a REPAIR log record if REPAIR/OPTIMIZE was successful. - ha_maria::data_file_type does not have to be set in every info() call, just do it once in open(). - if caller said that transactionality can be disabled (like if caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we temporarily disable transactionality of the table in external_lock(); that will ensure that no REDOs/UNDOs are logged for this possibly massive write operation (they are not needed, as if any write fails, the table will be dropped). We re-enable in external_lock(F_UNLCK), which in ALTER TABLE happens before the tmp table replaces the original one (which is good, as thus the final table will have a REDO RENAME and a correct create_rename_lsn). - when we commit we also have to write a log record, so trnman_commit_trn() calls become ma_commit() calls - at end of engine's initialization, we are potentially entering a multi-threaded dangerous world (clients are going to be accepted) and so some assertions of mutex-owning become enforceable, for that we set maria_multi_threaded=TRUE (see ma_control_file.c) storage/maria/ha_maria.h: new member ha_maria::save_transactional (see also ha_maria.cc) storage/maria/ma_blockrec.c: - fixing comments according to discussion with Monty - if a table is transactional but temporarily non-transactional (like in ALTER TABLE), we need to give a sensible LSN to the pages (and, if we give 0, pagecache asserts). - translog_write_record() now takes care of storing the share's 2-byte-id in the log record storage/maria/ma_blockrec.h: fixing comment according to discussion with Monty storage/maria/ma_check.c: When REPAIR/OPTIMIZE modify the data/index file, if this is a transactional table, they must sync it; if they remove files or rename files, they must sync the directory, so that everything is durable. This is just applying to REPAIR/OPTIMIZE the logic already implemented in CREATE/DROP/RENAME a few months ago. Adding a function to write a LOGREC_REPAIR_TABLE at end of REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and to update the table's create_rename_lsn. storage/maria/ma_close.c: fix for a future bug storage/maria/ma_control_file.c: ensuring that if Maria is running in multi-threaded mode, anybody wanting to write to the control file and update last_checkpoint_lsn/last_logno owns the log's lock. storage/maria/ma_control_file.h: see ma_control_file.c storage/maria/ma_create.c: when creating a table: - sync it and its directory only if this is a transactional table and there is a log (no point in syncing in maria_chk) - decouple the two uses of linkname/linkname_ptr (for index file and for data file) into more variables, as we need to know all links until the moment we write the LOGREC_CREATE_TABLE. - set share.data_file_type early so that _ma_initialize_data_file() knows it (Monty's bugfix so that a table always has at least a bitmap page when it is created; so data-file is not 0 bytes anymore). - log a LOGREC_CREATE_TABLE; it contains the bytes which we have just written to the index file's header. Update table's create_rename_lsn. - syncing of kfile had been bugified in a previous merge, correcting - syncing of dfile is now needed as it's not empty anymore - in _ma_initialize_data_file(), use share's block_size and not the global one. This is a gratuitous change, both variables are equal, just that I find it more future-proof to use share-bound variable rather than global one. storage/maria/ma_delete_all.c: log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows(); update create_rename_lsn then. storage/maria/ma_delete_table.c: - logging LOGREC_DROP_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - we need to sync directories only if the table is transactional storage/maria/ma_extra.c: questions storage/maria/ma_init.c: when maria_end() is called, engine is not multithreaded storage/maria/ma_loghandler.c: - translog_inited has to be visible to ma_create() (see how it is used in ma_create()) - checkpoint record will be a single record, not three - no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will log a REDO_CREATE) - adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by truncating the files), REPAIR. - MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk - in translog_write_record(), if MARIA_SHARE does not yet have a 2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically store this short id into log records. - in translog_write_record(), if transaction has not logged its long trid, log LOGREC_LONG_TRANSACTION_ID. - For Checkpoint, we need to know the current end-of-log: adding translog_get_horizon(). - For Control File, adding an assertion that the thread owns the log's lock (control file is protected by this lock) storage/maria/ma_loghandler.h: Changes in log records (see ma_loghandler.c). new prototypes, new functions. storage/maria/ma_loghandler_lsn.h: adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn, where the most significant byte is used for flags. storage/maria/ma_open.c: storing the create_rename_lsn in the index file's header (in the state, precisely) and retrieving it from there. storage/maria/ma_pagecache.c: - my set_if_bigger was wrong, correcting it - if the first_in_switch list is not empty, it means that changed_blocks misses some dirty pages, so Checkpoint cannot run and needs to wait. A variable missing_blocks_in_changed_list is added to tell that (should it be named missing_blocks_in_changed_blocks?) - pagecache_collect_changed_blocks_with_lsn() now also tells the minimum rec_lsn (needed for low-water mark computation). storage/maria/ma_pagecache.h: see ma_pagecache.c storage/maria/ma_panic.c: comment storage/maria/ma_range.c: comment storage/maria/ma_rename.c: - logging LOGREC_RENAME_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - update create_rename_lsn - we need to sync directories only if the table is transactional storage/maria/ma_static.c: comment storage/maria/ma_test_all.sh: - tip for Valgrind-ing ma_test_all - do "export maria_path=somepath" before calling ma_test_all, if you want to run ma_test_all out of storage/maria (useful to have parallel runs, like one normal and one Valgrind, they must not use the same tables so need to run in different directories) storage/maria/maria_def.h: - state now contains, in memory and on disk, the create_rename_lsn - share now contains a 2-byte-id storage/maria/trnman.c: preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn; minimum first_undo_lsn needed to know log's low-water-mark storage/maria/trnman.h: using most significant byte of first_undo_lsn to hold miscellaneous flags, for now TRANSACTION_LOGGED_LONG_ID. dummy_transaction_object is already declared in ma_static.c. storage/maria/trnman_public.h: dummy_transaction_object was declared in all files including trnman_public.h, while in fact it's a single object. new prototype storage/maria/unittest/ma_test_loghandler-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multigroup-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multithread-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: update for new prototype storage/maria/ma_commit.c: function which wraps: - writing a LOGREC_COMMIT record (==commit on disk) - calling trnman_commit_trn() (=commit in memory) storage/maria/ma_commit.h: new header file .tree-is-private: this file is now needed to keep our tree private (don't push it to public trees). When 5.1 is merged into mysql-maria, we can abandon our maria-specific post-commit trigger; .tree_is_private will take care of keeping commit mails private. Don't push this file to public trees.
2007-06-22 14:49:37 +02:00
/*
- if COMMIT record is written before trnman_commit_trn():
if Checkpoint comes in the middle it will see trn is not committed,
then if crash, Recovery might roll back trn (if min(rec_lsn) is after
COMMIT record) and this is not an issue as
* transaction's updates were not made visible to other transactions
* "commit ok" was not sent to client
Alternatively, Recovery might commit trn (if min(rec_lsn) is before COMMIT
record), which is ok too. All in all it means that "trn committed" is not
100% equal to "COMMIT record written".
- if COMMIT record is written after trnman_commit_trn():
if crash happens between the two, trn will be rolled back which is an
issue (transaction's updates were made visible to other transactions).
So we need to go the first way.
*/
Added maria_commit() and maria_begin() to be used with external tests Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs Note: The .MAD file is not binary identical after applying redo compare to a an original file. (This is becasue we don't have full information which function called PURGE_REDO_BLOCKS). To verify if a file was correctly applied, we now instead compare row checksums BitKeeper/etc/ignore: added storage/maria/tmp/* include/maria.h: Added maria_commit() and maria_begin() to be used with external tests storage/maria/ha_maria.cc: Ensure maria_def. is read in C mode storage/maria/ma_blockrec.c: Fixed redo handling. _ma_apply_redo_purge_blocks() updated to handle any number of purged blocks Removed code to make data file idenitcal after redo (can't easily be done). See changeset comments Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs storage/maria/ma_commit.c: More DBUG statements Moved variable declaration to start of function (portability fix) Added helper functions 'maria_commit()' and 'maria_begin()' storage/maria/ma_loghandler.c: Fixed wrong REDO_PURGE_BLOCKS initialization storage/maria/ma_recovery.c: Added UNDO_ROW_UPDATE Removed wrong setting of lsn (there was no lsn at the used position) Fixed REDO_PURGE_BLOCKS to handle any number of blocks storage/maria/ma_test1.c: Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log storage/maria/ma_test2.c: Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log storage/maria/ma_test_recovery: Create temporary files in maria/tmp Verify files with checksums instead of byte comparisons storage/maria/maria_chk.c: When using with -dss we only get filename, records and checksum. This is useful to do a quick comparision if a files is identical to another one. storage/maria/maria_def.h: Added ma_commit() storage/maria/maria_read_log.c: Added --help
2007-08-29 09:03:10 +03:00
- WL#3239 "log CREATE TABLE in Maria" - WL#3240 "log DROP TABLE in Maria" - similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and DELETE no_WHERE_clause (== the DELETE which just truncates the files) - create_rename_lsn added to MARIA_SHARE's state - all these operations (except DROP TABLE) also update the table's create_rename_lsn, which is needed for the correctness of Recovery (see function comment of _ma_repair_write_log_record() in ma_check.c) - write a COMMIT record when transaction commits. - don't log REDOs/UNDOs if this is an internal temporary table like inside ALTER TABLE (I expect this to be a big win). There was already no logging for user-created "CREATE TEMPORARY" tables. - don't fsync files/directories if the table is not transactional - in translog_write_record(), autogenerate a 2-byte-id for the table and log the "id->name" pair (LOGREC_FILE_ID); log LOGREC_LONG_TRANSACTION_ID; automatically store the table's 2-byte-id in any log record. - preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint when some dirty pages are unknown; capturing trn->rec_lsn, trn->first_undo_lsn for Checkpoint and log's low-water-mark computing. - assertions, comments. storage/maria/Makefile.am: more files to build storage/maria/ha_maria.cc: - logging a REPAIR log record if REPAIR/OPTIMIZE was successful. - ha_maria::data_file_type does not have to be set in every info() call, just do it once in open(). - if caller said that transactionality can be disabled (like if caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we temporarily disable transactionality of the table in external_lock(); that will ensure that no REDOs/UNDOs are logged for this possibly massive write operation (they are not needed, as if any write fails, the table will be dropped). We re-enable in external_lock(F_UNLCK), which in ALTER TABLE happens before the tmp table replaces the original one (which is good, as thus the final table will have a REDO RENAME and a correct create_rename_lsn). - when we commit we also have to write a log record, so trnman_commit_trn() calls become ma_commit() calls - at end of engine's initialization, we are potentially entering a multi-threaded dangerous world (clients are going to be accepted) and so some assertions of mutex-owning become enforceable, for that we set maria_multi_threaded=TRUE (see ma_control_file.c) storage/maria/ha_maria.h: new member ha_maria::save_transactional (see also ha_maria.cc) storage/maria/ma_blockrec.c: - fixing comments according to discussion with Monty - if a table is transactional but temporarily non-transactional (like in ALTER TABLE), we need to give a sensible LSN to the pages (and, if we give 0, pagecache asserts). - translog_write_record() now takes care of storing the share's 2-byte-id in the log record storage/maria/ma_blockrec.h: fixing comment according to discussion with Monty storage/maria/ma_check.c: When REPAIR/OPTIMIZE modify the data/index file, if this is a transactional table, they must sync it; if they remove files or rename files, they must sync the directory, so that everything is durable. This is just applying to REPAIR/OPTIMIZE the logic already implemented in CREATE/DROP/RENAME a few months ago. Adding a function to write a LOGREC_REPAIR_TABLE at end of REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and to update the table's create_rename_lsn. storage/maria/ma_close.c: fix for a future bug storage/maria/ma_control_file.c: ensuring that if Maria is running in multi-threaded mode, anybody wanting to write to the control file and update last_checkpoint_lsn/last_logno owns the log's lock. storage/maria/ma_control_file.h: see ma_control_file.c storage/maria/ma_create.c: when creating a table: - sync it and its directory only if this is a transactional table and there is a log (no point in syncing in maria_chk) - decouple the two uses of linkname/linkname_ptr (for index file and for data file) into more variables, as we need to know all links until the moment we write the LOGREC_CREATE_TABLE. - set share.data_file_type early so that _ma_initialize_data_file() knows it (Monty's bugfix so that a table always has at least a bitmap page when it is created; so data-file is not 0 bytes anymore). - log a LOGREC_CREATE_TABLE; it contains the bytes which we have just written to the index file's header. Update table's create_rename_lsn. - syncing of kfile had been bugified in a previous merge, correcting - syncing of dfile is now needed as it's not empty anymore - in _ma_initialize_data_file(), use share's block_size and not the global one. This is a gratuitous change, both variables are equal, just that I find it more future-proof to use share-bound variable rather than global one. storage/maria/ma_delete_all.c: log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows(); update create_rename_lsn then. storage/maria/ma_delete_table.c: - logging LOGREC_DROP_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - we need to sync directories only if the table is transactional storage/maria/ma_extra.c: questions storage/maria/ma_init.c: when maria_end() is called, engine is not multithreaded storage/maria/ma_loghandler.c: - translog_inited has to be visible to ma_create() (see how it is used in ma_create()) - checkpoint record will be a single record, not three - no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will log a REDO_CREATE) - adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by truncating the files), REPAIR. - MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk - in translog_write_record(), if MARIA_SHARE does not yet have a 2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically store this short id into log records. - in translog_write_record(), if transaction has not logged its long trid, log LOGREC_LONG_TRANSACTION_ID. - For Checkpoint, we need to know the current end-of-log: adding translog_get_horizon(). - For Control File, adding an assertion that the thread owns the log's lock (control file is protected by this lock) storage/maria/ma_loghandler.h: Changes in log records (see ma_loghandler.c). new prototypes, new functions. storage/maria/ma_loghandler_lsn.h: adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn, where the most significant byte is used for flags. storage/maria/ma_open.c: storing the create_rename_lsn in the index file's header (in the state, precisely) and retrieving it from there. storage/maria/ma_pagecache.c: - my set_if_bigger was wrong, correcting it - if the first_in_switch list is not empty, it means that changed_blocks misses some dirty pages, so Checkpoint cannot run and needs to wait. A variable missing_blocks_in_changed_list is added to tell that (should it be named missing_blocks_in_changed_blocks?) - pagecache_collect_changed_blocks_with_lsn() now also tells the minimum rec_lsn (needed for low-water mark computation). storage/maria/ma_pagecache.h: see ma_pagecache.c storage/maria/ma_panic.c: comment storage/maria/ma_range.c: comment storage/maria/ma_rename.c: - logging LOGREC_RENAME_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - update create_rename_lsn - we need to sync directories only if the table is transactional storage/maria/ma_static.c: comment storage/maria/ma_test_all.sh: - tip for Valgrind-ing ma_test_all - do "export maria_path=somepath" before calling ma_test_all, if you want to run ma_test_all out of storage/maria (useful to have parallel runs, like one normal and one Valgrind, they must not use the same tables so need to run in different directories) storage/maria/maria_def.h: - state now contains, in memory and on disk, the create_rename_lsn - share now contains a 2-byte-id storage/maria/trnman.c: preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn; minimum first_undo_lsn needed to know log's low-water-mark storage/maria/trnman.h: using most significant byte of first_undo_lsn to hold miscellaneous flags, for now TRANSACTION_LOGGED_LONG_ID. dummy_transaction_object is already declared in ma_static.c. storage/maria/trnman_public.h: dummy_transaction_object was declared in all files including trnman_public.h, while in fact it's a single object. new prototype storage/maria/unittest/ma_test_loghandler-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multigroup-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multithread-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: update for new prototype storage/maria/ma_commit.c: function which wraps: - writing a LOGREC_COMMIT record (==commit on disk) - calling trnman_commit_trn() (=commit in memory) storage/maria/ma_commit.h: new header file .tree-is-private: this file is now needed to keep our tree private (don't push it to public trees). When 5.1 is merged into mysql-maria, we can abandon our maria-specific post-commit trigger; .tree_is_private will take care of keeping commit mails private. Don't push this file to public trees.
2007-06-22 14:49:37 +02:00
/**
@todo RECOVERY share's state is written to disk only in
maria_lock_database(), so COMMIT record is not the last record of the
transaction! It is probably an issue. Recovery of the state is a problem
not yet solved.
*/
/*
We do not store "thd->transaction.xid_state.xid" for now, it will be
needed only when we support XA.
*/
Added maria_commit() and maria_begin() to be used with external tests Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs Note: The .MAD file is not binary identical after applying redo compare to a an original file. (This is becasue we don't have full information which function called PURGE_REDO_BLOCKS). To verify if a file was correctly applied, we now instead compare row checksums BitKeeper/etc/ignore: added storage/maria/tmp/* include/maria.h: Added maria_commit() and maria_begin() to be used with external tests storage/maria/ha_maria.cc: Ensure maria_def. is read in C mode storage/maria/ma_blockrec.c: Fixed redo handling. _ma_apply_redo_purge_blocks() updated to handle any number of purged blocks Removed code to make data file idenitcal after redo (can't easily be done). See changeset comments Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs storage/maria/ma_commit.c: More DBUG statements Moved variable declaration to start of function (portability fix) Added helper functions 'maria_commit()' and 'maria_begin()' storage/maria/ma_loghandler.c: Fixed wrong REDO_PURGE_BLOCKS initialization storage/maria/ma_recovery.c: Added UNDO_ROW_UPDATE Removed wrong setting of lsn (there was no lsn at the used position) Fixed REDO_PURGE_BLOCKS to handle any number of blocks storage/maria/ma_test1.c: Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log storage/maria/ma_test2.c: Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log storage/maria/ma_test_recovery: Create temporary files in maria/tmp Verify files with checksums instead of byte comparisons storage/maria/maria_chk.c: When using with -dss we only get filename, records and checksum. This is useful to do a quick comparision if a files is identical to another one. storage/maria/maria_def.h: Added ma_commit() storage/maria/maria_read_log.c: Added --help
2007-08-29 09:03:10 +03:00
res= (translog_write_record(&commit_lsn, LOGREC_COMMIT,
trn, NULL, 0,
sizeof(log_array)/sizeof(log_array[0]),
log_array, NULL) ||
translog_flush(commit_lsn) ||
trnman_commit_trn(trn));
- WL#3239 "log CREATE TABLE in Maria" - WL#3240 "log DROP TABLE in Maria" - similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and DELETE no_WHERE_clause (== the DELETE which just truncates the files) - create_rename_lsn added to MARIA_SHARE's state - all these operations (except DROP TABLE) also update the table's create_rename_lsn, which is needed for the correctness of Recovery (see function comment of _ma_repair_write_log_record() in ma_check.c) - write a COMMIT record when transaction commits. - don't log REDOs/UNDOs if this is an internal temporary table like inside ALTER TABLE (I expect this to be a big win). There was already no logging for user-created "CREATE TEMPORARY" tables. - don't fsync files/directories if the table is not transactional - in translog_write_record(), autogenerate a 2-byte-id for the table and log the "id->name" pair (LOGREC_FILE_ID); log LOGREC_LONG_TRANSACTION_ID; automatically store the table's 2-byte-id in any log record. - preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint when some dirty pages are unknown; capturing trn->rec_lsn, trn->first_undo_lsn for Checkpoint and log's low-water-mark computing. - assertions, comments. storage/maria/Makefile.am: more files to build storage/maria/ha_maria.cc: - logging a REPAIR log record if REPAIR/OPTIMIZE was successful. - ha_maria::data_file_type does not have to be set in every info() call, just do it once in open(). - if caller said that transactionality can be disabled (like if caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we temporarily disable transactionality of the table in external_lock(); that will ensure that no REDOs/UNDOs are logged for this possibly massive write operation (they are not needed, as if any write fails, the table will be dropped). We re-enable in external_lock(F_UNLCK), which in ALTER TABLE happens before the tmp table replaces the original one (which is good, as thus the final table will have a REDO RENAME and a correct create_rename_lsn). - when we commit we also have to write a log record, so trnman_commit_trn() calls become ma_commit() calls - at end of engine's initialization, we are potentially entering a multi-threaded dangerous world (clients are going to be accepted) and so some assertions of mutex-owning become enforceable, for that we set maria_multi_threaded=TRUE (see ma_control_file.c) storage/maria/ha_maria.h: new member ha_maria::save_transactional (see also ha_maria.cc) storage/maria/ma_blockrec.c: - fixing comments according to discussion with Monty - if a table is transactional but temporarily non-transactional (like in ALTER TABLE), we need to give a sensible LSN to the pages (and, if we give 0, pagecache asserts). - translog_write_record() now takes care of storing the share's 2-byte-id in the log record storage/maria/ma_blockrec.h: fixing comment according to discussion with Monty storage/maria/ma_check.c: When REPAIR/OPTIMIZE modify the data/index file, if this is a transactional table, they must sync it; if they remove files or rename files, they must sync the directory, so that everything is durable. This is just applying to REPAIR/OPTIMIZE the logic already implemented in CREATE/DROP/RENAME a few months ago. Adding a function to write a LOGREC_REPAIR_TABLE at end of REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and to update the table's create_rename_lsn. storage/maria/ma_close.c: fix for a future bug storage/maria/ma_control_file.c: ensuring that if Maria is running in multi-threaded mode, anybody wanting to write to the control file and update last_checkpoint_lsn/last_logno owns the log's lock. storage/maria/ma_control_file.h: see ma_control_file.c storage/maria/ma_create.c: when creating a table: - sync it and its directory only if this is a transactional table and there is a log (no point in syncing in maria_chk) - decouple the two uses of linkname/linkname_ptr (for index file and for data file) into more variables, as we need to know all links until the moment we write the LOGREC_CREATE_TABLE. - set share.data_file_type early so that _ma_initialize_data_file() knows it (Monty's bugfix so that a table always has at least a bitmap page when it is created; so data-file is not 0 bytes anymore). - log a LOGREC_CREATE_TABLE; it contains the bytes which we have just written to the index file's header. Update table's create_rename_lsn. - syncing of kfile had been bugified in a previous merge, correcting - syncing of dfile is now needed as it's not empty anymore - in _ma_initialize_data_file(), use share's block_size and not the global one. This is a gratuitous change, both variables are equal, just that I find it more future-proof to use share-bound variable rather than global one. storage/maria/ma_delete_all.c: log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows(); update create_rename_lsn then. storage/maria/ma_delete_table.c: - logging LOGREC_DROP_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - we need to sync directories only if the table is transactional storage/maria/ma_extra.c: questions storage/maria/ma_init.c: when maria_end() is called, engine is not multithreaded storage/maria/ma_loghandler.c: - translog_inited has to be visible to ma_create() (see how it is used in ma_create()) - checkpoint record will be a single record, not three - no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will log a REDO_CREATE) - adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by truncating the files), REPAIR. - MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk - in translog_write_record(), if MARIA_SHARE does not yet have a 2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically store this short id into log records. - in translog_write_record(), if transaction has not logged its long trid, log LOGREC_LONG_TRANSACTION_ID. - For Checkpoint, we need to know the current end-of-log: adding translog_get_horizon(). - For Control File, adding an assertion that the thread owns the log's lock (control file is protected by this lock) storage/maria/ma_loghandler.h: Changes in log records (see ma_loghandler.c). new prototypes, new functions. storage/maria/ma_loghandler_lsn.h: adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn, where the most significant byte is used for flags. storage/maria/ma_open.c: storing the create_rename_lsn in the index file's header (in the state, precisely) and retrieving it from there. storage/maria/ma_pagecache.c: - my set_if_bigger was wrong, correcting it - if the first_in_switch list is not empty, it means that changed_blocks misses some dirty pages, so Checkpoint cannot run and needs to wait. A variable missing_blocks_in_changed_list is added to tell that (should it be named missing_blocks_in_changed_blocks?) - pagecache_collect_changed_blocks_with_lsn() now also tells the minimum rec_lsn (needed for low-water mark computation). storage/maria/ma_pagecache.h: see ma_pagecache.c storage/maria/ma_panic.c: comment storage/maria/ma_range.c: comment storage/maria/ma_rename.c: - logging LOGREC_RENAME_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - update create_rename_lsn - we need to sync directories only if the table is transactional storage/maria/ma_static.c: comment storage/maria/ma_test_all.sh: - tip for Valgrind-ing ma_test_all - do "export maria_path=somepath" before calling ma_test_all, if you want to run ma_test_all out of storage/maria (useful to have parallel runs, like one normal and one Valgrind, they must not use the same tables so need to run in different directories) storage/maria/maria_def.h: - state now contains, in memory and on disk, the create_rename_lsn - share now contains a 2-byte-id storage/maria/trnman.c: preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn; minimum first_undo_lsn needed to know log's low-water-mark storage/maria/trnman.h: using most significant byte of first_undo_lsn to hold miscellaneous flags, for now TRANSACTION_LOGGED_LONG_ID. dummy_transaction_object is already declared in ma_static.c. storage/maria/trnman_public.h: dummy_transaction_object was declared in all files including trnman_public.h, while in fact it's a single object. new prototype storage/maria/unittest/ma_test_loghandler-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multigroup-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multithread-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: update for new prototype storage/maria/ma_commit.c: function which wraps: - writing a LOGREC_COMMIT record (==commit on disk) - calling trnman_commit_trn() (=commit in memory) storage/maria/ma_commit.h: new header file .tree-is-private: this file is now needed to keep our tree private (don't push it to public trees). When 5.1 is merged into mysql-maria, we can abandon our maria-specific post-commit trigger; .tree_is_private will take care of keeping commit mails private. Don't push this file to public trees.
2007-06-22 14:49:37 +02:00
/*
Note: if trnman_commit_trn() fails above, we have already
written the COMMIT record, so Checkpoint and Recovery will see the
transaction as committed.
*/
Added maria_commit() and maria_begin() to be used with external tests Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs Note: The .MAD file is not binary identical after applying redo compare to a an original file. (This is becasue we don't have full information which function called PURGE_REDO_BLOCKS). To verify if a file was correctly applied, we now instead compare row checksums BitKeeper/etc/ignore: added storage/maria/tmp/* include/maria.h: Added maria_commit() and maria_begin() to be used with external tests storage/maria/ha_maria.cc: Ensure maria_def. is read in C mode storage/maria/ma_blockrec.c: Fixed redo handling. _ma_apply_redo_purge_blocks() updated to handle any number of purged blocks Removed code to make data file idenitcal after redo (can't easily be done). See changeset comments Now ma_test1 -M -T and ma_test2 -M -T produces readable, applyable logs storage/maria/ma_commit.c: More DBUG statements Moved variable declaration to start of function (portability fix) Added helper functions 'maria_commit()' and 'maria_begin()' storage/maria/ma_loghandler.c: Fixed wrong REDO_PURGE_BLOCKS initialization storage/maria/ma_recovery.c: Added UNDO_ROW_UPDATE Removed wrong setting of lsn (there was no lsn at the used position) Fixed REDO_PURGE_BLOCKS to handle any number of blocks storage/maria/ma_test1.c: Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log storage/maria/ma_test2.c: Added transaction support (via maria_begin() & maria_commit()) to get a log that can be applied with maria_read_log storage/maria/ma_test_recovery: Create temporary files in maria/tmp Verify files with checksums instead of byte comparisons storage/maria/maria_chk.c: When using with -dss we only get filename, records and checksum. This is useful to do a quick comparision if a files is identical to another one. storage/maria/maria_def.h: Added ma_commit() storage/maria/maria_read_log.c: Added --help
2007-08-29 09:03:10 +03:00
DBUG_RETURN(res);
}
/**
@brief Writes a COMMIT record for a transaciton associated with a file
@param info Maria handler
@return Operation status
@retval 0 ok
@retval # error (disk error or out of memory)
*/
int maria_commit(MARIA_HA *info)
{
return info->s->now_transactional ? ma_commit(info->trn) : 0;
}
/**
@brief Starts a transaction on a file handle
@param info Maria handler
@return Operation status
@retval 0 ok
@retval # Error code.
*/
int maria_begin(MARIA_HA *info)
{
DBUG_ENTER("maria_begin");
if (info->s->now_transactional)
{
TRN *trn;
struct st_my_thread_var *mysys_var= my_thread_var;
trn= trnman_new_trn(&mysys_var->mutex,
&mysys_var->suspend,
(char*) &mysys_var + STACK_DIRECTION *1024*128);
if (unlikely(!trn))
DBUG_RETURN(HA_ERR_OUT_OF_MEM);
DBUG_PRINT("info", ("TRN set to 0x%lx", (ulong) trn));
info->trn= trn;
}
DBUG_RETURN(0);
- WL#3239 "log CREATE TABLE in Maria" - WL#3240 "log DROP TABLE in Maria" - similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and DELETE no_WHERE_clause (== the DELETE which just truncates the files) - create_rename_lsn added to MARIA_SHARE's state - all these operations (except DROP TABLE) also update the table's create_rename_lsn, which is needed for the correctness of Recovery (see function comment of _ma_repair_write_log_record() in ma_check.c) - write a COMMIT record when transaction commits. - don't log REDOs/UNDOs if this is an internal temporary table like inside ALTER TABLE (I expect this to be a big win). There was already no logging for user-created "CREATE TEMPORARY" tables. - don't fsync files/directories if the table is not transactional - in translog_write_record(), autogenerate a 2-byte-id for the table and log the "id->name" pair (LOGREC_FILE_ID); log LOGREC_LONG_TRANSACTION_ID; automatically store the table's 2-byte-id in any log record. - preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint when some dirty pages are unknown; capturing trn->rec_lsn, trn->first_undo_lsn for Checkpoint and log's low-water-mark computing. - assertions, comments. storage/maria/Makefile.am: more files to build storage/maria/ha_maria.cc: - logging a REPAIR log record if REPAIR/OPTIMIZE was successful. - ha_maria::data_file_type does not have to be set in every info() call, just do it once in open(). - if caller said that transactionality can be disabled (like if caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we temporarily disable transactionality of the table in external_lock(); that will ensure that no REDOs/UNDOs are logged for this possibly massive write operation (they are not needed, as if any write fails, the table will be dropped). We re-enable in external_lock(F_UNLCK), which in ALTER TABLE happens before the tmp table replaces the original one (which is good, as thus the final table will have a REDO RENAME and a correct create_rename_lsn). - when we commit we also have to write a log record, so trnman_commit_trn() calls become ma_commit() calls - at end of engine's initialization, we are potentially entering a multi-threaded dangerous world (clients are going to be accepted) and so some assertions of mutex-owning become enforceable, for that we set maria_multi_threaded=TRUE (see ma_control_file.c) storage/maria/ha_maria.h: new member ha_maria::save_transactional (see also ha_maria.cc) storage/maria/ma_blockrec.c: - fixing comments according to discussion with Monty - if a table is transactional but temporarily non-transactional (like in ALTER TABLE), we need to give a sensible LSN to the pages (and, if we give 0, pagecache asserts). - translog_write_record() now takes care of storing the share's 2-byte-id in the log record storage/maria/ma_blockrec.h: fixing comment according to discussion with Monty storage/maria/ma_check.c: When REPAIR/OPTIMIZE modify the data/index file, if this is a transactional table, they must sync it; if they remove files or rename files, they must sync the directory, so that everything is durable. This is just applying to REPAIR/OPTIMIZE the logic already implemented in CREATE/DROP/RENAME a few months ago. Adding a function to write a LOGREC_REPAIR_TABLE at end of REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and to update the table's create_rename_lsn. storage/maria/ma_close.c: fix for a future bug storage/maria/ma_control_file.c: ensuring that if Maria is running in multi-threaded mode, anybody wanting to write to the control file and update last_checkpoint_lsn/last_logno owns the log's lock. storage/maria/ma_control_file.h: see ma_control_file.c storage/maria/ma_create.c: when creating a table: - sync it and its directory only if this is a transactional table and there is a log (no point in syncing in maria_chk) - decouple the two uses of linkname/linkname_ptr (for index file and for data file) into more variables, as we need to know all links until the moment we write the LOGREC_CREATE_TABLE. - set share.data_file_type early so that _ma_initialize_data_file() knows it (Monty's bugfix so that a table always has at least a bitmap page when it is created; so data-file is not 0 bytes anymore). - log a LOGREC_CREATE_TABLE; it contains the bytes which we have just written to the index file's header. Update table's create_rename_lsn. - syncing of kfile had been bugified in a previous merge, correcting - syncing of dfile is now needed as it's not empty anymore - in _ma_initialize_data_file(), use share's block_size and not the global one. This is a gratuitous change, both variables are equal, just that I find it more future-proof to use share-bound variable rather than global one. storage/maria/ma_delete_all.c: log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows(); update create_rename_lsn then. storage/maria/ma_delete_table.c: - logging LOGREC_DROP_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - we need to sync directories only if the table is transactional storage/maria/ma_extra.c: questions storage/maria/ma_init.c: when maria_end() is called, engine is not multithreaded storage/maria/ma_loghandler.c: - translog_inited has to be visible to ma_create() (see how it is used in ma_create()) - checkpoint record will be a single record, not three - no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will log a REDO_CREATE) - adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by truncating the files), REPAIR. - MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk - in translog_write_record(), if MARIA_SHARE does not yet have a 2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically store this short id into log records. - in translog_write_record(), if transaction has not logged its long trid, log LOGREC_LONG_TRANSACTION_ID. - For Checkpoint, we need to know the current end-of-log: adding translog_get_horizon(). - For Control File, adding an assertion that the thread owns the log's lock (control file is protected by this lock) storage/maria/ma_loghandler.h: Changes in log records (see ma_loghandler.c). new prototypes, new functions. storage/maria/ma_loghandler_lsn.h: adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn, where the most significant byte is used for flags. storage/maria/ma_open.c: storing the create_rename_lsn in the index file's header (in the state, precisely) and retrieving it from there. storage/maria/ma_pagecache.c: - my set_if_bigger was wrong, correcting it - if the first_in_switch list is not empty, it means that changed_blocks misses some dirty pages, so Checkpoint cannot run and needs to wait. A variable missing_blocks_in_changed_list is added to tell that (should it be named missing_blocks_in_changed_blocks?) - pagecache_collect_changed_blocks_with_lsn() now also tells the minimum rec_lsn (needed for low-water mark computation). storage/maria/ma_pagecache.h: see ma_pagecache.c storage/maria/ma_panic.c: comment storage/maria/ma_range.c: comment storage/maria/ma_rename.c: - logging LOGREC_RENAME_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - update create_rename_lsn - we need to sync directories only if the table is transactional storage/maria/ma_static.c: comment storage/maria/ma_test_all.sh: - tip for Valgrind-ing ma_test_all - do "export maria_path=somepath" before calling ma_test_all, if you want to run ma_test_all out of storage/maria (useful to have parallel runs, like one normal and one Valgrind, they must not use the same tables so need to run in different directories) storage/maria/maria_def.h: - state now contains, in memory and on disk, the create_rename_lsn - share now contains a 2-byte-id storage/maria/trnman.c: preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn; minimum first_undo_lsn needed to know log's low-water-mark storage/maria/trnman.h: using most significant byte of first_undo_lsn to hold miscellaneous flags, for now TRANSACTION_LOGGED_LONG_ID. dummy_transaction_object is already declared in ma_static.c. storage/maria/trnman_public.h: dummy_transaction_object was declared in all files including trnman_public.h, while in fact it's a single object. new prototype storage/maria/unittest/ma_test_loghandler-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multigroup-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multithread-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: update for new prototype storage/maria/ma_commit.c: function which wraps: - writing a LOGREC_COMMIT record (==commit on disk) - calling trnman_commit_trn() (=commit in memory) storage/maria/ma_commit.h: new header file .tree-is-private: this file is now needed to keep our tree private (don't push it to public trees). When 5.1 is merged into mysql-maria, we can abandon our maria-specific post-commit trigger; .tree_is_private will take care of keeping commit mails private. Don't push this file to public trees.
2007-06-22 14:49:37 +02:00
}