2007-06-26 22:30:09 +02:00
|
|
|
/* Copyright (C) 2007 MySQL AB & Sanja Belkin
|
|
|
|
|
|
|
|
This program is free software; you can redistribute it and/or modify
|
|
|
|
it under the terms of the GNU General Public License as published by
|
|
|
|
the Free Software Foundation; version 2 of the License.
|
|
|
|
|
|
|
|
This program is distributed in the hope that it will be useful,
|
|
|
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
GNU General Public License for more details.
|
|
|
|
|
|
|
|
You should have received a copy of the GNU General Public License
|
|
|
|
along with this program; if not, write to the Free Software
|
|
|
|
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
|
|
|
|
|
2007-02-02 08:41:32 +01:00
|
|
|
#ifndef _ma_loghandler_lsn_h
|
|
|
|
#define _ma_loghandler_lsn_h
|
|
|
|
|
2007-02-27 09:59:01 +01:00
|
|
|
/*
|
|
|
|
Transaction log record address:
|
|
|
|
file_no << 32 | offset
|
|
|
|
file_no is only 3 bytes so we can use signed integer to make
|
|
|
|
comparison more simple.
|
|
|
|
*/
|
2007-02-12 13:23:43 +01:00
|
|
|
typedef int64 TRANSLOG_ADDRESS;
|
2007-02-02 08:41:32 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
Compare addresses
|
|
|
|
A1 > A2 -> result > 0
|
|
|
|
A1 == A2 -> 0
|
|
|
|
A1 < A2 -> result < 0
|
|
|
|
*/
|
2007-02-12 13:23:43 +01:00
|
|
|
#define cmp_translog_addr(A1,A2) ((A1) - (A2))
|
2007-02-02 08:41:32 +01:00
|
|
|
|
|
|
|
/* LSN type (address of certain log record chank */
|
|
|
|
typedef TRANSLOG_ADDRESS LSN;
|
|
|
|
|
2007-02-12 13:23:43 +01:00
|
|
|
/* Gets file number part of a LSN/log address */
|
|
|
|
#define LSN_FILE_NO(L) ((L) >> 32)
|
|
|
|
|
|
|
|
/* Gets raw file number part of a LSN/log address */
|
WL#3072 Maria Recovery
- new program maria_read_log to display and apply log records
found in a Maria log (see file's revision comment)
- minor, misc fixes
storage/maria/Makefile.am:
new program maria_read_log
storage/maria/ha_maria.cc:
create control file if missing
storage/maria/ma_blockrec.c:
0 -> LSN_IMPOSSIBLE; comments
storage/maria/ma_checkpoint.h:
preparations for Checkpoint module
storage/maria/ma_close.c:
comment
storage/maria/ma_control_file.c:
renaming constants.
Possibility to say "open control file but don't create it if it's
missing" (used by maria_read_log which does not want to create
anything)
storage/maria/ma_control_file.h:
renaming constants
storage/maria/ma_create.c:
I had duplicated "linkname" and "linkname_ptr", now I see it's not
needed, reverting. Indeed those variables don't contain interesting
information; fixing log record accordingly (the links are in
ci->data/index_file_name). Storing keystart in log record is needed,
to know at which size we must extend the file if we replay
LOGREC_CREATE_TABLE.
storage/maria/ma_loghandler.c:
some structures need to be known to maria_read_log.c, taking
them to ma_loghandler.h
storage/maria/ma_loghandler.h:
we have page_store, adding page_korr.
translog_lock() made public, because Checkpoint will need it (to
write to control file).
Some structures moved from ma_loghandler.c because maria_read_log.c
needs them (needs to know the execute-in-REDO-phase hooks of each
record).
storage/maria/ma_loghandler_lsn.h:
constants defined in ma_control_file.h serve everywhere,
and they relate to LSNs, so putting them in ma_loghandler_lsn.h.
Stronger constraints in LSN_VALID().
storage/maria/ma_pagecache.c:
renaming constants
storage/maria/ma_recovery.h:
copyright
storage/maria/ma_test1.c:
new prototype
storage/maria/ma_test2.c:
new prototype
storage/maria/trnman_public.h:
double-inclusion safe
storage/maria/unittest/ma_control_file-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
constants renamed, new prototype
storage/myisam/mi_close.c:
comment
storage/maria/maria_read_log.c:
program to read and print log records from a Maria transaction log,
and optionally apply them to tables. Very basic, early version.
Should serve as a base for Recovery's code. Designed to be idempotent.
Create a log by running maria.test, then cd to var/master-data
and run "maria_read_log --only-display" to see info about records;
run "maria_read_log --display-and-apply" to also apply the records
to tables (it's more interesting if you first wipe out the
tables in var/master-data/test, to see how they get re-created).
Only a few records are handled by now: LONG_TRANSACTION_ID,
COMMIT, FILE_ID, REDO_CREATE_TABLE; place is ready for
REDO_INSERT_ROW_HEAD where I could use Monty's help (search for
"Monty" in the file). Note: changes to the index pages, index's header
and bitmap pages are not properly logged yet, so don't expect
the program to work with that.
2007-06-26 16:49:23 +02:00
|
|
|
#define LSN_FILE_NO_PART(L) ((L) & ((int64)0xFFFFFF00000000LL))
|
2007-02-12 13:23:43 +01:00
|
|
|
|
|
|
|
/* Gets record offset of a LSN/log address */
|
|
|
|
#define LSN_OFFSET(L) ((L) & 0xFFFFFFFFL)
|
|
|
|
|
|
|
|
/* Makes lsn/log address from file number and record offset */
|
2007-07-03 23:50:17 +02:00
|
|
|
#define MAKE_LSN(F,S) ((LSN) ((((uint64)(F)) << 32) | (S)))
|
2007-02-12 13:23:43 +01:00
|
|
|
|
|
|
|
/* checks LSN */
|
WL#3072 Maria Recovery
- new program maria_read_log to display and apply log records
found in a Maria log (see file's revision comment)
- minor, misc fixes
storage/maria/Makefile.am:
new program maria_read_log
storage/maria/ha_maria.cc:
create control file if missing
storage/maria/ma_blockrec.c:
0 -> LSN_IMPOSSIBLE; comments
storage/maria/ma_checkpoint.h:
preparations for Checkpoint module
storage/maria/ma_close.c:
comment
storage/maria/ma_control_file.c:
renaming constants.
Possibility to say "open control file but don't create it if it's
missing" (used by maria_read_log which does not want to create
anything)
storage/maria/ma_control_file.h:
renaming constants
storage/maria/ma_create.c:
I had duplicated "linkname" and "linkname_ptr", now I see it's not
needed, reverting. Indeed those variables don't contain interesting
information; fixing log record accordingly (the links are in
ci->data/index_file_name). Storing keystart in log record is needed,
to know at which size we must extend the file if we replay
LOGREC_CREATE_TABLE.
storage/maria/ma_loghandler.c:
some structures need to be known to maria_read_log.c, taking
them to ma_loghandler.h
storage/maria/ma_loghandler.h:
we have page_store, adding page_korr.
translog_lock() made public, because Checkpoint will need it (to
write to control file).
Some structures moved from ma_loghandler.c because maria_read_log.c
needs them (needs to know the execute-in-REDO-phase hooks of each
record).
storage/maria/ma_loghandler_lsn.h:
constants defined in ma_control_file.h serve everywhere,
and they relate to LSNs, so putting them in ma_loghandler_lsn.h.
Stronger constraints in LSN_VALID().
storage/maria/ma_pagecache.c:
renaming constants
storage/maria/ma_recovery.h:
copyright
storage/maria/ma_test1.c:
new prototype
storage/maria/ma_test2.c:
new prototype
storage/maria/trnman_public.h:
double-inclusion safe
storage/maria/unittest/ma_control_file-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
constants renamed, new prototype
storage/myisam/mi_close.c:
comment
storage/maria/maria_read_log.c:
program to read and print log records from a Maria transaction log,
and optionally apply them to tables. Very basic, early version.
Should serve as a base for Recovery's code. Designed to be idempotent.
Create a log by running maria.test, then cd to var/master-data
and run "maria_read_log --only-display" to see info about records;
run "maria_read_log --display-and-apply" to also apply the records
to tables (it's more interesting if you first wipe out the
tables in var/master-data/test, to see how they get re-created).
Only a few records are handled by now: LONG_TRANSACTION_ID,
COMMIT, FILE_ID, REDO_CREATE_TABLE; place is ready for
REDO_INSERT_ROW_HEAD where I could use Monty's help (search for
"Monty" in the file). Note: changes to the index pages, index's header
and bitmap pages are not properly logged yet, so don't expect
the program to work with that.
2007-06-26 16:49:23 +02:00
|
|
|
#define LSN_VALID(L) \
|
|
|
|
((LSN_FILE_NO_PART(L) != FILENO_IMPOSSIBLE) && \
|
|
|
|
(LSN_OFFSET(L) != LOG_OFFSET_IMPOSSIBLE))
|
2007-02-12 13:23:43 +01:00
|
|
|
|
- WL#3239 "log CREATE TABLE in Maria"
- WL#3240 "log DROP TABLE in Maria"
- similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and
DELETE no_WHERE_clause (== the DELETE which just truncates the files)
- create_rename_lsn added to MARIA_SHARE's state
- all these operations (except DROP TABLE) also update the table's
create_rename_lsn, which is needed for the correctness of
Recovery (see function comment of _ma_repair_write_log_record()
in ma_check.c)
- write a COMMIT record when transaction commits.
- don't log REDOs/UNDOs if this is an internal temporary table
like inside ALTER TABLE (I expect this to be a big win). There was
already no logging for user-created "CREATE TEMPORARY" tables.
- don't fsync files/directories if the table is not transactional
- in translog_write_record(), autogenerate a 2-byte-id for the table
and log the "id->name" pair (LOGREC_FILE_ID); log
LOGREC_LONG_TRANSACTION_ID; automatically store
the table's 2-byte-id in any log record.
- preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint
when some dirty pages are unknown; capturing trn->rec_lsn,
trn->first_undo_lsn for Checkpoint and log's low-water-mark computing.
- assertions, comments.
storage/maria/Makefile.am:
more files to build
storage/maria/ha_maria.cc:
- logging a REPAIR log record if REPAIR/OPTIMIZE was successful.
- ha_maria::data_file_type does not have to be set in every info()
call, just do it once in open().
- if caller said that transactionality can be disabled (like if
caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we
temporarily disable transactionality of the table in external_lock();
that will ensure that no REDOs/UNDOs are logged for this possibly
massive write operation (they are not needed, as if any write fails,
the table will be dropped). We re-enable in external_lock(F_UNLCK),
which in ALTER TABLE happens before the tmp table replaces the original
one (which is good, as thus the final table will have a REDO RENAME
and a correct create_rename_lsn).
- when we commit we also have to write a log record, so
trnman_commit_trn() calls become ma_commit() calls
- at end of engine's initialization, we are potentially entering a
multi-threaded dangerous world (clients are going to be accepted)
and so some assertions of mutex-owning become enforceable, for that
we set maria_multi_threaded=TRUE (see ma_control_file.c)
storage/maria/ha_maria.h:
new member ha_maria::save_transactional (see also ha_maria.cc)
storage/maria/ma_blockrec.c:
- fixing comments according to discussion with Monty
- if a table is transactional but temporarily non-transactional
(like in ALTER TABLE), we need to give a sensible LSN to the pages
(and, if we give 0, pagecache asserts).
- translog_write_record() now takes care of storing the share's
2-byte-id in the log record
storage/maria/ma_blockrec.h:
fixing comment according to discussion with Monty
storage/maria/ma_check.c:
When REPAIR/OPTIMIZE modify the data/index file, if this is a
transactional table, they must sync it; if they remove files or rename
files, they must sync the directory, so that everything is durable.
This is just applying to REPAIR/OPTIMIZE the logic already implemented
in CREATE/DROP/RENAME a few months ago.
Adding a function to write a LOGREC_REPAIR_TABLE at end of
REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and
to update the table's create_rename_lsn.
storage/maria/ma_close.c:
fix for a future bug
storage/maria/ma_control_file.c:
ensuring that if Maria is running in multi-threaded mode, anybody
wanting to write to the control file and update
last_checkpoint_lsn/last_logno owns the log's lock.
storage/maria/ma_control_file.h:
see ma_control_file.c
storage/maria/ma_create.c:
when creating a table:
- sync it and its directory only if this is a transactional table
and there is a log (no point in syncing in maria_chk)
- decouple the two uses of linkname/linkname_ptr (for index file and
for data file) into more variables, as we need to know all links
until the moment we write the LOGREC_CREATE_TABLE.
- set share.data_file_type early so that _ma_initialize_data_file()
knows it (Monty's bugfix so that a table always has at least a bitmap
page when it is created; so data-file is not 0 bytes anymore).
- log a LOGREC_CREATE_TABLE; it contains the bytes which we have
just written to the index file's header. Update table's
create_rename_lsn.
- syncing of kfile had been bugified in a previous merge, correcting
- syncing of dfile is now needed as it's not empty anymore
- in _ma_initialize_data_file(), use share's block_size and not the
global one. This is a gratuitous change, both variables are equal,
just that I find it more future-proof to use share-bound variable
rather than global one.
storage/maria/ma_delete_all.c:
log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows();
update create_rename_lsn then.
storage/maria/ma_delete_table.c:
- logging LOGREC_DROP_TABLE; knowing if this is needed, requires
knowing if the table is transactional, which requires opening the
table.
- we need to sync directories only if the table is transactional
storage/maria/ma_extra.c:
questions
storage/maria/ma_init.c:
when maria_end() is called, engine is not multithreaded
storage/maria/ma_loghandler.c:
- translog_inited has to be visible to ma_create() (see how it is used
in ma_create())
- checkpoint record will be a single record, not three
- no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will
log a REDO_CREATE)
- adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by
truncating the files), REPAIR.
- MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk
- in translog_write_record(), if MARIA_SHARE does not yet have a
2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically
store this short id into log records.
- in translog_write_record(), if transaction has not logged its
long trid, log LOGREC_LONG_TRANSACTION_ID.
- For Checkpoint, we need to know the current end-of-log: adding
translog_get_horizon().
- For Control File, adding an assertion that the thread owns the
log's lock (control file is protected by this lock)
storage/maria/ma_loghandler.h:
Changes in log records (see ma_loghandler.c).
new prototypes, new functions.
storage/maria/ma_loghandler_lsn.h:
adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn,
where the most significant byte is used for flags.
storage/maria/ma_open.c:
storing the create_rename_lsn in the index file's header (in the
state, precisely) and retrieving it from there.
storage/maria/ma_pagecache.c:
- my set_if_bigger was wrong, correcting it
- if the first_in_switch list is not empty, it means that
changed_blocks misses some dirty pages, so Checkpoint cannot run and
needs to wait. A variable missing_blocks_in_changed_list is added to
tell that (should it be named missing_blocks_in_changed_blocks?)
- pagecache_collect_changed_blocks_with_lsn() now also tells the
minimum rec_lsn (needed for low-water mark computation).
storage/maria/ma_pagecache.h:
see ma_pagecache.c
storage/maria/ma_panic.c:
comment
storage/maria/ma_range.c:
comment
storage/maria/ma_rename.c:
- logging LOGREC_RENAME_TABLE; knowing if this is needed, requires
knowing if the table is transactional, which requires opening the
table.
- update create_rename_lsn
- we need to sync directories only if the table is transactional
storage/maria/ma_static.c:
comment
storage/maria/ma_test_all.sh:
- tip for Valgrind-ing ma_test_all
- do "export maria_path=somepath" before calling ma_test_all,
if you want to run ma_test_all out of storage/maria (useful
to have parallel runs, like one normal and one Valgrind, they
must not use the same tables so need to run in different directories)
storage/maria/maria_def.h:
- state now contains, in memory and on disk, the create_rename_lsn
- share now contains a 2-byte-id
storage/maria/trnman.c:
preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn;
minimum first_undo_lsn needed to know log's low-water-mark
storage/maria/trnman.h:
using most significant byte of first_undo_lsn to hold miscellaneous
flags, for now TRANSACTION_LOGGED_LONG_ID.
dummy_transaction_object is already declared in ma_static.c.
storage/maria/trnman_public.h:
dummy_transaction_object was declared in all files including
trnman_public.h, while in fact it's a single object.
new prototype
storage/maria/unittest/ma_test_loghandler-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
update for new prototype
storage/maria/ma_commit.c:
function which wraps:
- writing a LOGREC_COMMIT record (==commit on disk)
- calling trnman_commit_trn() (=commit in memory)
storage/maria/ma_commit.h:
new header file
.tree-is-private:
this file is now needed to keep our tree private (don't push it
to public trees). When 5.1 is merged into mysql-maria, we can abandon
our maria-specific post-commit trigger; .tree_is_private will take
care of keeping commit mails private. Don't push this file to public
trees.
2007-06-22 14:49:37 +02:00
|
|
|
/* size of stored LSN on a disk, don't change it! */
|
2007-02-19 22:01:27 +01:00
|
|
|
#define LSN_STORE_SIZE 7
|
|
|
|
|
2007-02-02 08:41:32 +01:00
|
|
|
/* Puts LSN into buffer (dst) */
|
2007-02-19 22:01:27 +01:00
|
|
|
#define lsn_store(dst, lsn) \
|
2007-02-02 08:41:32 +01:00
|
|
|
do { \
|
2007-02-12 13:23:43 +01:00
|
|
|
int3store((dst), LSN_FILE_NO(lsn)); \
|
|
|
|
int4store((dst) + 3, LSN_OFFSET(lsn)); \
|
2007-02-02 08:41:32 +01:00
|
|
|
} while (0)
|
|
|
|
|
|
|
|
/* Unpacks LSN from the buffer (P) */
|
2007-02-19 22:01:27 +01:00
|
|
|
#define lsn_korr(P) MAKE_LSN(uint3korr(P), uint4korr((P) + 3))
|
2007-02-12 13:23:43 +01:00
|
|
|
|
|
|
|
/* what we need to add to LSN to increase it on one file */
|
|
|
|
#define LSN_ONE_FILE ((int64)0x100000000LL)
|
|
|
|
|
WL#3072 Maria Recovery
- new program maria_read_log to display and apply log records
found in a Maria log (see file's revision comment)
- minor, misc fixes
storage/maria/Makefile.am:
new program maria_read_log
storage/maria/ha_maria.cc:
create control file if missing
storage/maria/ma_blockrec.c:
0 -> LSN_IMPOSSIBLE; comments
storage/maria/ma_checkpoint.h:
preparations for Checkpoint module
storage/maria/ma_close.c:
comment
storage/maria/ma_control_file.c:
renaming constants.
Possibility to say "open control file but don't create it if it's
missing" (used by maria_read_log which does not want to create
anything)
storage/maria/ma_control_file.h:
renaming constants
storage/maria/ma_create.c:
I had duplicated "linkname" and "linkname_ptr", now I see it's not
needed, reverting. Indeed those variables don't contain interesting
information; fixing log record accordingly (the links are in
ci->data/index_file_name). Storing keystart in log record is needed,
to know at which size we must extend the file if we replay
LOGREC_CREATE_TABLE.
storage/maria/ma_loghandler.c:
some structures need to be known to maria_read_log.c, taking
them to ma_loghandler.h
storage/maria/ma_loghandler.h:
we have page_store, adding page_korr.
translog_lock() made public, because Checkpoint will need it (to
write to control file).
Some structures moved from ma_loghandler.c because maria_read_log.c
needs them (needs to know the execute-in-REDO-phase hooks of each
record).
storage/maria/ma_loghandler_lsn.h:
constants defined in ma_control_file.h serve everywhere,
and they relate to LSNs, so putting them in ma_loghandler_lsn.h.
Stronger constraints in LSN_VALID().
storage/maria/ma_pagecache.c:
renaming constants
storage/maria/ma_recovery.h:
copyright
storage/maria/ma_test1.c:
new prototype
storage/maria/ma_test2.c:
new prototype
storage/maria/trnman_public.h:
double-inclusion safe
storage/maria/unittest/ma_control_file-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
constants renamed, new prototype
storage/myisam/mi_close.c:
comment
storage/maria/maria_read_log.c:
program to read and print log records from a Maria transaction log,
and optionally apply them to tables. Very basic, early version.
Should serve as a base for Recovery's code. Designed to be idempotent.
Create a log by running maria.test, then cd to var/master-data
and run "maria_read_log --only-display" to see info about records;
run "maria_read_log --display-and-apply" to also apply the records
to tables (it's more interesting if you first wipe out the
tables in var/master-data/test, to see how they get re-created).
Only a few records are handled by now: LONG_TRANSACTION_ID,
COMMIT, FILE_ID, REDO_CREATE_TABLE; place is ready for
REDO_INSERT_ROW_HEAD where I could use Monty's help (search for
"Monty" in the file). Note: changes to the index pages, index's header
and bitmap pages are not properly logged yet, so don't expect
the program to work with that.
2007-06-26 16:49:23 +02:00
|
|
|
#define LSN_REPLACE_OFFSET(L, S) (LSN_FILE_NO_PART(L) | (S))
|
2007-02-02 08:41:32 +01:00
|
|
|
|
- WL#3239 "log CREATE TABLE in Maria"
- WL#3240 "log DROP TABLE in Maria"
- similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and
DELETE no_WHERE_clause (== the DELETE which just truncates the files)
- create_rename_lsn added to MARIA_SHARE's state
- all these operations (except DROP TABLE) also update the table's
create_rename_lsn, which is needed for the correctness of
Recovery (see function comment of _ma_repair_write_log_record()
in ma_check.c)
- write a COMMIT record when transaction commits.
- don't log REDOs/UNDOs if this is an internal temporary table
like inside ALTER TABLE (I expect this to be a big win). There was
already no logging for user-created "CREATE TEMPORARY" tables.
- don't fsync files/directories if the table is not transactional
- in translog_write_record(), autogenerate a 2-byte-id for the table
and log the "id->name" pair (LOGREC_FILE_ID); log
LOGREC_LONG_TRANSACTION_ID; automatically store
the table's 2-byte-id in any log record.
- preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint
when some dirty pages are unknown; capturing trn->rec_lsn,
trn->first_undo_lsn for Checkpoint and log's low-water-mark computing.
- assertions, comments.
storage/maria/Makefile.am:
more files to build
storage/maria/ha_maria.cc:
- logging a REPAIR log record if REPAIR/OPTIMIZE was successful.
- ha_maria::data_file_type does not have to be set in every info()
call, just do it once in open().
- if caller said that transactionality can be disabled (like if
caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we
temporarily disable transactionality of the table in external_lock();
that will ensure that no REDOs/UNDOs are logged for this possibly
massive write operation (they are not needed, as if any write fails,
the table will be dropped). We re-enable in external_lock(F_UNLCK),
which in ALTER TABLE happens before the tmp table replaces the original
one (which is good, as thus the final table will have a REDO RENAME
and a correct create_rename_lsn).
- when we commit we also have to write a log record, so
trnman_commit_trn() calls become ma_commit() calls
- at end of engine's initialization, we are potentially entering a
multi-threaded dangerous world (clients are going to be accepted)
and so some assertions of mutex-owning become enforceable, for that
we set maria_multi_threaded=TRUE (see ma_control_file.c)
storage/maria/ha_maria.h:
new member ha_maria::save_transactional (see also ha_maria.cc)
storage/maria/ma_blockrec.c:
- fixing comments according to discussion with Monty
- if a table is transactional but temporarily non-transactional
(like in ALTER TABLE), we need to give a sensible LSN to the pages
(and, if we give 0, pagecache asserts).
- translog_write_record() now takes care of storing the share's
2-byte-id in the log record
storage/maria/ma_blockrec.h:
fixing comment according to discussion with Monty
storage/maria/ma_check.c:
When REPAIR/OPTIMIZE modify the data/index file, if this is a
transactional table, they must sync it; if they remove files or rename
files, they must sync the directory, so that everything is durable.
This is just applying to REPAIR/OPTIMIZE the logic already implemented
in CREATE/DROP/RENAME a few months ago.
Adding a function to write a LOGREC_REPAIR_TABLE at end of
REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and
to update the table's create_rename_lsn.
storage/maria/ma_close.c:
fix for a future bug
storage/maria/ma_control_file.c:
ensuring that if Maria is running in multi-threaded mode, anybody
wanting to write to the control file and update
last_checkpoint_lsn/last_logno owns the log's lock.
storage/maria/ma_control_file.h:
see ma_control_file.c
storage/maria/ma_create.c:
when creating a table:
- sync it and its directory only if this is a transactional table
and there is a log (no point in syncing in maria_chk)
- decouple the two uses of linkname/linkname_ptr (for index file and
for data file) into more variables, as we need to know all links
until the moment we write the LOGREC_CREATE_TABLE.
- set share.data_file_type early so that _ma_initialize_data_file()
knows it (Monty's bugfix so that a table always has at least a bitmap
page when it is created; so data-file is not 0 bytes anymore).
- log a LOGREC_CREATE_TABLE; it contains the bytes which we have
just written to the index file's header. Update table's
create_rename_lsn.
- syncing of kfile had been bugified in a previous merge, correcting
- syncing of dfile is now needed as it's not empty anymore
- in _ma_initialize_data_file(), use share's block_size and not the
global one. This is a gratuitous change, both variables are equal,
just that I find it more future-proof to use share-bound variable
rather than global one.
storage/maria/ma_delete_all.c:
log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows();
update create_rename_lsn then.
storage/maria/ma_delete_table.c:
- logging LOGREC_DROP_TABLE; knowing if this is needed, requires
knowing if the table is transactional, which requires opening the
table.
- we need to sync directories only if the table is transactional
storage/maria/ma_extra.c:
questions
storage/maria/ma_init.c:
when maria_end() is called, engine is not multithreaded
storage/maria/ma_loghandler.c:
- translog_inited has to be visible to ma_create() (see how it is used
in ma_create())
- checkpoint record will be a single record, not three
- no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will
log a REDO_CREATE)
- adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by
truncating the files), REPAIR.
- MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk
- in translog_write_record(), if MARIA_SHARE does not yet have a
2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically
store this short id into log records.
- in translog_write_record(), if transaction has not logged its
long trid, log LOGREC_LONG_TRANSACTION_ID.
- For Checkpoint, we need to know the current end-of-log: adding
translog_get_horizon().
- For Control File, adding an assertion that the thread owns the
log's lock (control file is protected by this lock)
storage/maria/ma_loghandler.h:
Changes in log records (see ma_loghandler.c).
new prototypes, new functions.
storage/maria/ma_loghandler_lsn.h:
adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn,
where the most significant byte is used for flags.
storage/maria/ma_open.c:
storing the create_rename_lsn in the index file's header (in the
state, precisely) and retrieving it from there.
storage/maria/ma_pagecache.c:
- my set_if_bigger was wrong, correcting it
- if the first_in_switch list is not empty, it means that
changed_blocks misses some dirty pages, so Checkpoint cannot run and
needs to wait. A variable missing_blocks_in_changed_list is added to
tell that (should it be named missing_blocks_in_changed_blocks?)
- pagecache_collect_changed_blocks_with_lsn() now also tells the
minimum rec_lsn (needed for low-water mark computation).
storage/maria/ma_pagecache.h:
see ma_pagecache.c
storage/maria/ma_panic.c:
comment
storage/maria/ma_range.c:
comment
storage/maria/ma_rename.c:
- logging LOGREC_RENAME_TABLE; knowing if this is needed, requires
knowing if the table is transactional, which requires opening the
table.
- update create_rename_lsn
- we need to sync directories only if the table is transactional
storage/maria/ma_static.c:
comment
storage/maria/ma_test_all.sh:
- tip for Valgrind-ing ma_test_all
- do "export maria_path=somepath" before calling ma_test_all,
if you want to run ma_test_all out of storage/maria (useful
to have parallel runs, like one normal and one Valgrind, they
must not use the same tables so need to run in different directories)
storage/maria/maria_def.h:
- state now contains, in memory and on disk, the create_rename_lsn
- share now contains a 2-byte-id
storage/maria/trnman.c:
preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn;
minimum first_undo_lsn needed to know log's low-water-mark
storage/maria/trnman.h:
using most significant byte of first_undo_lsn to hold miscellaneous
flags, for now TRANSACTION_LOGGED_LONG_ID.
dummy_transaction_object is already declared in ma_static.c.
storage/maria/trnman_public.h:
dummy_transaction_object was declared in all files including
trnman_public.h, while in fact it's a single object.
new prototype
storage/maria/unittest/ma_test_loghandler-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
update for new prototype
storage/maria/ma_commit.c:
function which wraps:
- writing a LOGREC_COMMIT record (==commit on disk)
- calling trnman_commit_trn() (=commit in memory)
storage/maria/ma_commit.h:
new header file
.tree-is-private:
this file is now needed to keep our tree private (don't push it
to public trees). When 5.1 is merged into mysql-maria, we can abandon
our maria-specific post-commit trigger; .tree_is_private will take
care of keeping commit mails private. Don't push this file to public
trees.
2007-06-22 14:49:37 +02:00
|
|
|
/*
|
2007-07-02 19:45:15 +02:00
|
|
|
an 8-byte type whose most significant uchar is used for "flags"; 7
|
- WL#3239 "log CREATE TABLE in Maria"
- WL#3240 "log DROP TABLE in Maria"
- similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and
DELETE no_WHERE_clause (== the DELETE which just truncates the files)
- create_rename_lsn added to MARIA_SHARE's state
- all these operations (except DROP TABLE) also update the table's
create_rename_lsn, which is needed for the correctness of
Recovery (see function comment of _ma_repair_write_log_record()
in ma_check.c)
- write a COMMIT record when transaction commits.
- don't log REDOs/UNDOs if this is an internal temporary table
like inside ALTER TABLE (I expect this to be a big win). There was
already no logging for user-created "CREATE TEMPORARY" tables.
- don't fsync files/directories if the table is not transactional
- in translog_write_record(), autogenerate a 2-byte-id for the table
and log the "id->name" pair (LOGREC_FILE_ID); log
LOGREC_LONG_TRANSACTION_ID; automatically store
the table's 2-byte-id in any log record.
- preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint
when some dirty pages are unknown; capturing trn->rec_lsn,
trn->first_undo_lsn for Checkpoint and log's low-water-mark computing.
- assertions, comments.
storage/maria/Makefile.am:
more files to build
storage/maria/ha_maria.cc:
- logging a REPAIR log record if REPAIR/OPTIMIZE was successful.
- ha_maria::data_file_type does not have to be set in every info()
call, just do it once in open().
- if caller said that transactionality can be disabled (like if
caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we
temporarily disable transactionality of the table in external_lock();
that will ensure that no REDOs/UNDOs are logged for this possibly
massive write operation (they are not needed, as if any write fails,
the table will be dropped). We re-enable in external_lock(F_UNLCK),
which in ALTER TABLE happens before the tmp table replaces the original
one (which is good, as thus the final table will have a REDO RENAME
and a correct create_rename_lsn).
- when we commit we also have to write a log record, so
trnman_commit_trn() calls become ma_commit() calls
- at end of engine's initialization, we are potentially entering a
multi-threaded dangerous world (clients are going to be accepted)
and so some assertions of mutex-owning become enforceable, for that
we set maria_multi_threaded=TRUE (see ma_control_file.c)
storage/maria/ha_maria.h:
new member ha_maria::save_transactional (see also ha_maria.cc)
storage/maria/ma_blockrec.c:
- fixing comments according to discussion with Monty
- if a table is transactional but temporarily non-transactional
(like in ALTER TABLE), we need to give a sensible LSN to the pages
(and, if we give 0, pagecache asserts).
- translog_write_record() now takes care of storing the share's
2-byte-id in the log record
storage/maria/ma_blockrec.h:
fixing comment according to discussion with Monty
storage/maria/ma_check.c:
When REPAIR/OPTIMIZE modify the data/index file, if this is a
transactional table, they must sync it; if they remove files or rename
files, they must sync the directory, so that everything is durable.
This is just applying to REPAIR/OPTIMIZE the logic already implemented
in CREATE/DROP/RENAME a few months ago.
Adding a function to write a LOGREC_REPAIR_TABLE at end of
REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and
to update the table's create_rename_lsn.
storage/maria/ma_close.c:
fix for a future bug
storage/maria/ma_control_file.c:
ensuring that if Maria is running in multi-threaded mode, anybody
wanting to write to the control file and update
last_checkpoint_lsn/last_logno owns the log's lock.
storage/maria/ma_control_file.h:
see ma_control_file.c
storage/maria/ma_create.c:
when creating a table:
- sync it and its directory only if this is a transactional table
and there is a log (no point in syncing in maria_chk)
- decouple the two uses of linkname/linkname_ptr (for index file and
for data file) into more variables, as we need to know all links
until the moment we write the LOGREC_CREATE_TABLE.
- set share.data_file_type early so that _ma_initialize_data_file()
knows it (Monty's bugfix so that a table always has at least a bitmap
page when it is created; so data-file is not 0 bytes anymore).
- log a LOGREC_CREATE_TABLE; it contains the bytes which we have
just written to the index file's header. Update table's
create_rename_lsn.
- syncing of kfile had been bugified in a previous merge, correcting
- syncing of dfile is now needed as it's not empty anymore
- in _ma_initialize_data_file(), use share's block_size and not the
global one. This is a gratuitous change, both variables are equal,
just that I find it more future-proof to use share-bound variable
rather than global one.
storage/maria/ma_delete_all.c:
log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows();
update create_rename_lsn then.
storage/maria/ma_delete_table.c:
- logging LOGREC_DROP_TABLE; knowing if this is needed, requires
knowing if the table is transactional, which requires opening the
table.
- we need to sync directories only if the table is transactional
storage/maria/ma_extra.c:
questions
storage/maria/ma_init.c:
when maria_end() is called, engine is not multithreaded
storage/maria/ma_loghandler.c:
- translog_inited has to be visible to ma_create() (see how it is used
in ma_create())
- checkpoint record will be a single record, not three
- no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will
log a REDO_CREATE)
- adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by
truncating the files), REPAIR.
- MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk
- in translog_write_record(), if MARIA_SHARE does not yet have a
2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically
store this short id into log records.
- in translog_write_record(), if transaction has not logged its
long trid, log LOGREC_LONG_TRANSACTION_ID.
- For Checkpoint, we need to know the current end-of-log: adding
translog_get_horizon().
- For Control File, adding an assertion that the thread owns the
log's lock (control file is protected by this lock)
storage/maria/ma_loghandler.h:
Changes in log records (see ma_loghandler.c).
new prototypes, new functions.
storage/maria/ma_loghandler_lsn.h:
adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn,
where the most significant byte is used for flags.
storage/maria/ma_open.c:
storing the create_rename_lsn in the index file's header (in the
state, precisely) and retrieving it from there.
storage/maria/ma_pagecache.c:
- my set_if_bigger was wrong, correcting it
- if the first_in_switch list is not empty, it means that
changed_blocks misses some dirty pages, so Checkpoint cannot run and
needs to wait. A variable missing_blocks_in_changed_list is added to
tell that (should it be named missing_blocks_in_changed_blocks?)
- pagecache_collect_changed_blocks_with_lsn() now also tells the
minimum rec_lsn (needed for low-water mark computation).
storage/maria/ma_pagecache.h:
see ma_pagecache.c
storage/maria/ma_panic.c:
comment
storage/maria/ma_range.c:
comment
storage/maria/ma_rename.c:
- logging LOGREC_RENAME_TABLE; knowing if this is needed, requires
knowing if the table is transactional, which requires opening the
table.
- update create_rename_lsn
- we need to sync directories only if the table is transactional
storage/maria/ma_static.c:
comment
storage/maria/ma_test_all.sh:
- tip for Valgrind-ing ma_test_all
- do "export maria_path=somepath" before calling ma_test_all,
if you want to run ma_test_all out of storage/maria (useful
to have parallel runs, like one normal and one Valgrind, they
must not use the same tables so need to run in different directories)
storage/maria/maria_def.h:
- state now contains, in memory and on disk, the create_rename_lsn
- share now contains a 2-byte-id
storage/maria/trnman.c:
preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn;
minimum first_undo_lsn needed to know log's low-water-mark
storage/maria/trnman.h:
using most significant byte of first_undo_lsn to hold miscellaneous
flags, for now TRANSACTION_LOGGED_LONG_ID.
dummy_transaction_object is already declared in ma_static.c.
storage/maria/trnman_public.h:
dummy_transaction_object was declared in all files including
trnman_public.h, while in fact it's a single object.
new prototype
storage/maria/unittest/ma_test_loghandler-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
update for new prototype
storage/maria/ma_commit.c:
function which wraps:
- writing a LOGREC_COMMIT record (==commit on disk)
- calling trnman_commit_trn() (=commit in memory)
storage/maria/ma_commit.h:
new header file
.tree-is-private:
this file is now needed to keep our tree private (don't push it
to public trees). When 5.1 is merged into mysql-maria, we can abandon
our maria-specific post-commit trigger; .tree_is_private will take
care of keeping commit mails private. Don't push this file to public
trees.
2007-06-22 14:49:37 +02:00
|
|
|
other bytes are a LSN.
|
|
|
|
*/
|
|
|
|
typedef LSN LSN_WITH_FLAGS;
|
|
|
|
#define LSN_WITH_FLAGS_TO_LSN(x) (x & ULL(0x00FFFFFFFFFFFFFF))
|
|
|
|
#define LSN_WITH_FLAGS_TO_FLAGS(x) (x & ULL(0xFF00000000000000))
|
|
|
|
|
WL#3072 Maria Recovery
- new program maria_read_log to display and apply log records
found in a Maria log (see file's revision comment)
- minor, misc fixes
storage/maria/Makefile.am:
new program maria_read_log
storage/maria/ha_maria.cc:
create control file if missing
storage/maria/ma_blockrec.c:
0 -> LSN_IMPOSSIBLE; comments
storage/maria/ma_checkpoint.h:
preparations for Checkpoint module
storage/maria/ma_close.c:
comment
storage/maria/ma_control_file.c:
renaming constants.
Possibility to say "open control file but don't create it if it's
missing" (used by maria_read_log which does not want to create
anything)
storage/maria/ma_control_file.h:
renaming constants
storage/maria/ma_create.c:
I had duplicated "linkname" and "linkname_ptr", now I see it's not
needed, reverting. Indeed those variables don't contain interesting
information; fixing log record accordingly (the links are in
ci->data/index_file_name). Storing keystart in log record is needed,
to know at which size we must extend the file if we replay
LOGREC_CREATE_TABLE.
storage/maria/ma_loghandler.c:
some structures need to be known to maria_read_log.c, taking
them to ma_loghandler.h
storage/maria/ma_loghandler.h:
we have page_store, adding page_korr.
translog_lock() made public, because Checkpoint will need it (to
write to control file).
Some structures moved from ma_loghandler.c because maria_read_log.c
needs them (needs to know the execute-in-REDO-phase hooks of each
record).
storage/maria/ma_loghandler_lsn.h:
constants defined in ma_control_file.h serve everywhere,
and they relate to LSNs, so putting them in ma_loghandler_lsn.h.
Stronger constraints in LSN_VALID().
storage/maria/ma_pagecache.c:
renaming constants
storage/maria/ma_recovery.h:
copyright
storage/maria/ma_test1.c:
new prototype
storage/maria/ma_test2.c:
new prototype
storage/maria/trnman_public.h:
double-inclusion safe
storage/maria/unittest/ma_control_file-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
constants renamed, new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
constants renamed, new prototype
storage/myisam/mi_close.c:
comment
storage/maria/maria_read_log.c:
program to read and print log records from a Maria transaction log,
and optionally apply them to tables. Very basic, early version.
Should serve as a base for Recovery's code. Designed to be idempotent.
Create a log by running maria.test, then cd to var/master-data
and run "maria_read_log --only-display" to see info about records;
run "maria_read_log --display-and-apply" to also apply the records
to tables (it's more interesting if you first wipe out the
tables in var/master-data/test, to see how they get re-created).
Only a few records are handled by now: LONG_TRANSACTION_ID,
COMMIT, FILE_ID, REDO_CREATE_TABLE; place is ready for
REDO_INSERT_ROW_HEAD where I could use Monty's help (search for
"Monty" in the file). Note: changes to the index pages, index's header
and bitmap pages are not properly logged yet, so don't expect
the program to work with that.
2007-06-26 16:49:23 +02:00
|
|
|
#define FILENO_IMPOSSIBLE 0 /**< log file's numbering starts at 1 */
|
|
|
|
#define LOG_OFFSET_IMPOSSIBLE 0 /**< log always has a header */
|
|
|
|
#define LSN_IMPOSSIBLE 0
|
2007-08-13 21:54:29 +02:00
|
|
|
/* following LSN also is impossible */
|
|
|
|
#define LSN_ERROR 1
|
2007-08-02 12:42:49 +02:00
|
|
|
|
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because
Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistent
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing
of rec_lsn to know if page should be skipped during Recovery
(unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)
mysys/my_realloc.c:
noted an issue in my_realloc()
sql/mysqld.cc:
page cache needs to be created before engines are initialized,
because Maria's initialization may do a recovery which needs
the page cache.
storage/maria/ha_maria.cc:
update to new prototype
storage/maria/ma_bitmap.c:
when creating the first bitmap page we used chsize to 8192 bytes then
pwrite (overwrite) the last 2 bytes (8191-8192). If crash between
the two operations, this leaves a bitmap page full without its end
marker. A later recovery may try to read this page and find it
exists and misses a marker and conclude it's corrupted and fail.
Changing the chsize to only 8190 bytes: recovery will then find
the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c:
Fix for a bug: when executing a REDO, if the data page is created,
data_file_length was increased before _ma_bitmap_set():
_ma_bitmap_set() called _ma_read_bitmap_page() which, due to the
increased data_file_length, expected to find a bitmap page on disk
with a correct end marker; if the bitmap page didn't exist already
in fact, this failed. Fixed by increasing data_file_length only after
_ma_read_bitmap_page() has created the new bitmap page correctly.
This bug could happen every time a REDO is about creating a new
bitmap page.
storage/maria/ma_check.c:
empty data file has a bitmap page
storage/maria/ma_control_file.c:
useless parameter to ma_control_file_create_or_open(), just
test if this is recovery.
storage/maria/ma_control_file.h:
new prototype
storage/maria/ma_create.c:
Back to how it was before: maria_create() creates an 8k bitmap page.
Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c:
as ma_test1 and ma_test2 now use real transactions and not
dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always
about real transactions, can assert this.
A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h:
new function
storage/maria/ma_loghandler_lsn.h:
maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
* enforce that DMLs on transactional tables use real transactions
and not dummy_transaction_object.
* test if table was repaired with maria_chk (which has to been
seen as an import of an external table into the server), test
validity of create_rename_lsn (header corruption detection)
* comments.
storage/maria/ma_recovery.c:
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing
of rec_lsn to know if page should be skipped during Recovery
(unused in this patch as no Checkpoint module pushed yet)
* reworking all around (less duplication)
storage/maria/ma_recovery.h:
a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c:
tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
* update to new prototype
* no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
* a function for Recovery to create a transaction (TRN), needed
in the UNDO phase
* a function for Recovery to grab an existing transaction, needed
in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h:
new functions
2007-08-29 16:43:01 +02:00
|
|
|
/** @brief some impossible LSN serve as markers */
|
- WL#3072 Maria Recovery:
Recovery of state.records (the count of records which is stored into
the header of the index file). For that, state.is_of_lsn is introduced;
logic is explained in ma_recovery.c (look for "Recovery of the state").
The net gain is that in case of crash, we now recover state.records,
and it is idempotent (ma_test_recovery tests it).
state.checksum is not recovered yet, mail sent for discussion.
- WL#3071 Maria Checkpoint: preparation for it, by protecting
all modifications of the state in memory or on disk with intern_lock
(with the exception of the really-often-modified state.records,
which is now protected with the log's lock, see ma_recovery.c
(look for "Recovery of the state"). Also, if maria_close() sees that
Checkpoint is looking at this table it will not my_free() the share.
- don't compute row's checksum twice in case of UPDATE (correction
to a bugfix I made yesterday).
storage/maria/ha_maria.cc:
protect state write with intern_lock (against Checkpoint)
storage/maria/ma_blockrec.c:
* don't reset trn->rec_lsn in _ma_unpin_all_pages(), because it
should wait until we have corrected the allocation in the bitmap
(as the REDO can serve to correct the allocation during Recovery);
introducing _ma_finalize_row() for that.
* In a changeset yesterday I moved computation of the checksum
into write_block_record(), to fix a bug in UPDATE. Now I notice
that maria_update() already computes the checksum, it's just that
it puts it into info->cur_row while _ma_update_block_record()
uses info->new_row; so, removing the checksum computation from
write_block_record(), putting it back into allocate_and_write_block_record()
(which is called only by INSERT and UNDO_DELETE), and copying
cur_row->checksum into new_row->checksum in _ma_update_block_record().
storage/maria/ma_check.c:
new prototypes, they will take intern_lock when writing the state;
also take intern_lock when changing share->kfile. In both cases
this is to protect against Checkpoint reading/writing the state or reading
kfile at the same time.
Not updating create_rename_lsn directly at end of write_log_record_for_repair()
as it wouldn't have intern_lock.
storage/maria/ma_close.c:
Checkpoint builds a list of shares (under THR_LOCK_maria), then it
handles each such share (under intern_lock) (doing flushing etc);
if maria_close() freed this share between the two, Checkpoint
would see a bad pointer. To avoid this, when building the list Checkpoint
marks each share, so that maria_close() knows it should not free it
and Checkpoint will free it itself.
Extending the zone covered by intern_lock to protect against
Checkpoint reading kfile, writing state.
storage/maria/ma_create.c:
When we update create_rename_lsn, we also update is_of_lsn to
the same value: it is logical, and allows us to test in maria_open()
that the former is not bigger than the latter (the contrary is a sign
of index header corruption, or severe logging bug which hinders
Recovery, table needs a repair).
_ma_update_create_rename_lsn_on_disk() also writes is_of_lsn;
it now operates under intern_lock (protect against Checkpoint),
a shortcut function is available for cases where acquiring
intern_lock is not needed (table's creation or first open).
storage/maria/ma_delete.c:
if table is transactional, "records" is already decremented
when logging UNDO_ROW_DELETE.
storage/maria/ma_delete_all.c:
comments
storage/maria/ma_extra.c:
Protect modifications of the state, in memory and/or on disk,
with intern_lock, against a concurrent Checkpoint.
When state goes to disk, update it's is_of_lsn (by calling
the new _ma_state_info_write()).
In HA_EXTRA_FORCE_REOPEN, don't set share->changed to 0 (undoing
a change I made a few days ago) and ASK_MONTY
storage/maria/ma_locking.c:
no real code change here.
storage/maria/ma_loghandler.c:
Log-write-hooks for updating "state.records" under log's mutex
when writing/updating/deleting a row or deleting all rows.
storage/maria/ma_loghandler_lsn.h:
merge (make LSN_ERROR and LSN_REPAIRED_BY_MARIA_CHK different)
storage/maria/ma_open.c:
When opening a table verify that is_of_lsn >= create_rename_lsn; if
false the header must be corrupted.
_ma_state_info_write() is split in two: _ma_state_info_write_sub()
which is the old _ma_state_info_write(), and _ma_state_info_write()
which additionally takes intern_lock if requested (to protect
against Checkpoint) and updates is_of_lsn.
_ma_open_keyfile() should change kfile.file under intern_lock
to protect Checkpoint from reading a wrong kfile.file.
storage/maria/ma_recovery.c:
Recovery of state.records: when the REDO phase sees UNDO_ROW_INSERT
which has a LSN > state.is_of_lsn it increments state.records.
Same for UNDO_ROW_DELETE and UNDO_ROW_PURGE.
When closing a table during Recovery, we know its state is at least
as new as the current log record we are looking at, so increase
is_of_lsn to the LSN of the current log record.
storage/maria/ma_rename.c:
update for new behaviour of _ma_update_create_rename_lsn_on_disk().
storage/maria/ma_test1.c:
update to new prototype
storage/maria/ma_test2.c:
update to new prototype (actually prototype was changed days ago,
but compiler does not complain about the extra argument??)
storage/maria/ma_test_recovery.expected:
new result file of ma_test_recovery. Improvements: record
count read from index's header is now always correct.
storage/maria/ma_test_recovery:
"rm" fails if file does not exist. Redirect stderr of script.
storage/maria/ma_write.c:
if table is transactional, "records" is already incremented when
logging UNDO_ROW_INSERT. Comments.
storage/maria/maria_chk.c:
update is_of_lsn too
storage/maria/maria_def.h:
- MARIA_STATE_INFO::is_of_lsn which is used by Recovery. It is stored
into the index file's header.
- Checkpoint can now mark a table as "don't free this", and maria_close()
can reply "ok then you will free it".
- new functions
storage/maria/maria_pack.c:
update for new name
2007-09-07 15:02:30 +02:00
|
|
|
#define LSN_REPAIRED_BY_MARIA_CHK ((LSN)2)
|
WL#3072 Maria recovery
* create page cache before initializing engine and not after, because
Maria's recovery needs a page cache
* make the creation of a bitmap page more crash-resistent
* bugfix (see ma_blockrec.c)
* back to old way: create an 8k bitmap page when creating table
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing
of rec_lsn to know if page should be skipped during Recovery
(unused in this patch as no Checkpoint module pushed yet)
* maria_chk tags repaired table with a special LSN
* reworking all around in ma_recovery.c (less duplication)
mysys/my_realloc.c:
noted an issue in my_realloc()
sql/mysqld.cc:
page cache needs to be created before engines are initialized,
because Maria's initialization may do a recovery which needs
the page cache.
storage/maria/ha_maria.cc:
update to new prototype
storage/maria/ma_bitmap.c:
when creating the first bitmap page we used chsize to 8192 bytes then
pwrite (overwrite) the last 2 bytes (8191-8192). If crash between
the two operations, this leaves a bitmap page full without its end
marker. A later recovery may try to read this page and find it
exists and misses a marker and conclude it's corrupted and fail.
Changing the chsize to only 8190 bytes: recovery will then find
the page is too short and recreate it entirely.
storage/maria/ma_blockrec.c:
Fix for a bug: when executing a REDO, if the data page is created,
data_file_length was increased before _ma_bitmap_set():
_ma_bitmap_set() called _ma_read_bitmap_page() which, due to the
increased data_file_length, expected to find a bitmap page on disk
with a correct end marker; if the bitmap page didn't exist already
in fact, this failed. Fixed by increasing data_file_length only after
_ma_read_bitmap_page() has created the new bitmap page correctly.
This bug could happen every time a REDO is about creating a new
bitmap page.
storage/maria/ma_check.c:
empty data file has a bitmap page
storage/maria/ma_control_file.c:
useless parameter to ma_control_file_create_or_open(), just
test if this is recovery.
storage/maria/ma_control_file.h:
new prototype
storage/maria/ma_create.c:
Back to how it was before: maria_create() creates an 8k bitmap page.
Thus (bugfix) data_file_length needs to reflect this instead of being 0.
storage/maria/ma_loghandler.c:
as ma_test1 and ma_test2 now use real transactions and not
dummy_transaction_object, REDO for INSERT/UPDATE/DELETE are always
about real transactions, can assert this.
A function for Recovery to assign a short id to a table.
storage/maria/ma_loghandler.h:
new function
storage/maria/ma_loghandler_lsn.h:
maria_chk tags repaired tables with this LSN
storage/maria/ma_open.c:
* enforce that DMLs on transactional tables use real transactions
and not dummy_transaction_object.
* test if table was repaired with maria_chk (which has to been
seen as an import of an external table into the server), test
validity of create_rename_lsn (header corruption detection)
* comments.
storage/maria/ma_recovery.c:
* preparations for the UNDO phase: recreate TRNs
* preparations for Checkpoint: list of dirty pages, testing
of rec_lsn to know if page should be skipped during Recovery
(unused in this patch as no Checkpoint module pushed yet)
* reworking all around (less duplication)
storage/maria/ma_recovery.h:
a parameter to say if the UNDO phase should be skipped
storage/maria/maria_chk.c:
tag repaired tables with a special LSN
storage/maria/maria_read_log.c:
* update to new prototype
* no UNDO phase in maria_read_log for now
storage/maria/trnman.c:
* a function for Recovery to create a transaction (TRN), needed
in the UNDO phase
* a function for Recovery to grab an existing transaction, needed
in the UNDO phase (rollback all existing transactions)
storage/maria/trnman_public.h:
new functions
2007-08-29 16:43:01 +02:00
|
|
|
|
2007-08-02 12:42:49 +02:00
|
|
|
/**
|
|
|
|
@brief the maximum valid LSN.
|
|
|
|
Unlike ULONGLONG_MAX, it can be safely used in comparison with valid LSNs
|
|
|
|
(ULONGLONG_MAX is too big for correctness of cmp_translog_address()).
|
|
|
|
*/
|
|
|
|
#define LSN_MAX (LSN)ULL(0x00FFFFFFFFFFFFFF)
|
|
|
|
|
2007-02-02 08:41:32 +01:00
|
|
|
#endif
|