
347 commits

Author SHA1 Message Date
vdimov
c31479b5f2 Non-functional change: update copyright year to 2010 of the files
that have been modified after 2010-01-01 according to svn.

for f in $(svn log -v -r{2010-01-01}:HEAD |grep "^   M " |cut -b 16- |sort -u) ; do sed -i "" -E 's/(Copyright \(c\) [0-9]{4},) [0-9]{4}, (.*Innobase Oy.+All Rights Reserved)/\1 2010, \2/' $f ; done
2010-03-26 14:19:01 +00:00
jyang
38d926a8d0 branches/zip: This is patch from Inaam that uses red-black tree
to speed up insertions into the flush_list and thus the recovery
process. The patch has been tested by Nokia.
2010-03-23 16:20:36 +00:00
marko
8766f62c90 branches/zip: Add ut_ad(mtr->state == MTR_ACTIVE) to various places. 2010-03-11 10:02:57 +00:00
vasil
6c3ab906d5 Non-functional change: update copyright year to 2010 of the files
that have been modified after 2010-01-01 according to svn.

for f in $(svn log -v -r{2010-01-01}:HEAD |grep "^   M " |cut -b 16- |sort -u) ; do sed -i "" -E 's/(Copyright \(c\) [0-9]{4},) [0-9]{4}, (.*Innobase Oy.+All Rights Reserved)/\1 2010, \2/' $f ; done
2010-02-20 16:45:41 +00:00
marko
b46217f571 branches/zip: Merge revisions 6538:6613 from branches/5.1:
------------------------------------------------------------------------
  r6545 | jyang | 2010-02-03 03:57:32 +0200 (Wed, 03 Feb 2010) | 8 lines
  Changed paths:
     M /branches/5.1/lock/lock0lock.c

  branches/5.1: Fix bug , "SHOW INNODB STATUS deadlock info
  incorrect when deadlock detection aborts". Print the correct
  lock owner when recursive function lock_deadlock_recursive()
  exceeds its maximum depth LOCK_MAX_DEPTH_IN_DEADLOCK_CHECK.

  rb://217, approved by Marko.
  ------------------------------------------------------------------------
  r6613 | inaam | 2010-02-09 20:23:09 +0200 (Tue, 09 Feb 2010) | 11 lines
  Changed paths:
     M /branches/5.1/buf/buf0buf.c
     M /branches/5.1/buf/buf0rea.c
     M /branches/5.1/include/buf0rea.h

  branches/5.1: Fix Bug 
  InnoDB logs error repeatedly when trying to load page into buffer pool

  In buf_page_get_gen() if we are unable to read a page (because of
  corruption or some other reason) we keep on retrying. This fills up
  error log with millions of entries in no time and we'd eventually run
  out of disk space. This patch limits the number of attempts that we
  make (currently set to 100) and after that we abort with a message.

  rb://241 Approved by: Heikki
  ------------------------------------------------------------------------
2010-02-10 08:40:55 +00:00
marko
f82647abe5 branches/zip: Pass the file name and line number of the caller of the
b-tree cursor functions to the buffer pool requests, in order to make
the latch diagnostics more accurate.

buf_page_optimistic_get_func(): Renamed to buf_page_optimistic_get().

btr_page_get_father_node_ptr(), btr_insert_on_non_leaf_level(),
btr_pcur_open(), btr_pcur_open_with_no_init(), btr_pcur_open_on_user_rec(),
btr_pcur_open_at_rnd_pos(), btr_pcur_restore_position(),
btr_cur_open_at_index_side(), btr_cur_open_at_rnd_pos():
Rename the function to _func and add the parameters file, line.
Define wrapper macros with __FILE__, __LINE__.

btr_cur_search_to_nth_level(): Add the parameters file, line.
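
The rename-plus-wrapper pattern described in this message can be sketched roughly as follows. The names `page_get_func`, `page_get` and `struct latch_info` are hypothetical and stand in for the real b-tree cursor and buffer pool functions; only the use of `__FILE__` and `__LINE__` is taken from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of who last requested a page; a stand-in for
the latch diagnostics mentioned in the commit message. */
struct latch_info {
	const char*	file;	/* file name of the caller */
	unsigned	line;	/* line number of the caller */
};

/* The renamed function: it receives the caller's location explicitly
and stores it for later diagnostics. */
static int
page_get_func(struct latch_info* info, int page_no,
	      const char* file, unsigned line)
{
	assert(info != NULL);

	info->file = file;
	info->line = line;

	return(page_no);
}

/* Wrapper macro under the old name: every call site passes its own
location without any change to the calling code. */
#define page_get(info, page_no) \
	page_get_func(info, page_no, __FILE__, __LINE__)
```

Because the macro is expanded at the call site, the recorded location is that of the caller rather than of the wrapper, which is why a macro is used here instead of an ordinary function.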
2010-02-04 11:21:18 +00:00
marko
8a5da89f1d branches/zip: buf_LRU_invalidate_tablespace(): Ensure that prev_bpage
is not relocated when freeing a compressed block.  This avoids the
costly rescan of the LRU list.  (Bug , Issue )

At most one buffer-fix will be active at a time, affecting two blocks:
the buf_page_t and the compressed page frame. This should not block
the memory defragmentation in buf0buddy.c too much.  In fact, it may
avoid unnecessary copying if also prev_bpage belongs to the tablespace
that is being invalidated.

rb://240
2010-02-03 13:01:39 +00:00
marko
4f3d92292a branches/zip: buf_LRU_invalidate_tablespace(): Do not unnecessarily
acquire the block_mutex for every block in the LRU list. Only acquire
it when holding buf_pool_mutex is not sufficient. This should speed up
the function and considerably reduce traffic on the memory bus and
caches.

I noticed this deficiency when working on Issue .
This deficiency popped up again in Issue  (Bug ),
which this fix does not fully address.

rb://78 revision 1 approved by Heikki Tuuri.
2010-01-28 14:23:15 +00:00
marko
6377f605a9 branches/zip: buf_page_get_gen(): Obey recv_no_ibuf_operations
and do not call ibuf_merge_or_delete_for_page() in crash recovery,
before the redo log has been applied.
This could cure some hard-to-repeat, hard-to-explain bugs
related to secondary indexes.

A possible recipe to repeat the bug:

1. update a secondary index leaf page on a compressed table
2. evict the page from the buffer pool while it is still dirty
3. ibuf_insert() something for the page
4. crash
5. crash recovery; ibuf merge would be done too early,
before applying redo log to the sec index page or the ibuf pages
2010-01-21 09:22:52 +00:00
marko
0ccf7024db branches/zip: buf_pool_drop_hash_index(): Check block->page.state
before checking block->is_hashed, because the latter may be uninitialized
right after server startup.
2010-01-13 15:15:29 +00:00
marko
dda7217e08 branches/zip: Free all resources at shutdown. Set pointers to NULL, so
that Valgrind will not complain about freed data structures that are
reachable via pointers.  This addresses Bug  and Bug .

This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied.  Some added cleanup code is specific to MySQL/InnoDB.

rb://199 approved by Sunny Bains
2009-11-02 09:42:56 +00:00
marko
7d0ad4af4f branches/zip: Fix corruption of buf_pool->LRU_old and improve debug assertions.
This was reported as Issue .

buf_page_set_old(): Assert that blocks may only be set old if
buf_pool->LRU_old is initialized and buf_pool->LRU_old_len is nonzero.
Assert that buf_pool->LRU_old points to the block at the old/new boundary.

buf_LRU_old_adjust_len(): Invoke buf_page_set_old() after adjusting
buf_pool->LRU_old and buf_pool->LRU_old_len, in order not to violate
the added assertions.

buf_LRU_old_init(): Replace buf_page_set_old() with a direct
assignment to bpage->old, because these loops that initialize all the
blocks would temporarily violate the assertions about
buf_pool->LRU_old.

buf_LRU_remove_block(): When setting buf_pool->LRU_old = NULL, also
clear all bpage->old flags and set buf_pool->LRU_old_len = 0.

buf_LRU_add_block_to_end_low(), buf_LRU_add_block_low(): Move the
buf_page_set_old() call later in order not to violate the debug
assertions.  If buf_pool->LRU_old is NULL, set old=FALSE.

buf_LRU_free_block(): Replace the UNIV_LRU_DEBUG assertion with a
dummy buf_page_set_old() call that performs more thorough checks.

buf_LRU_validate(): Do not tolerate garbage in buf_pool->LRU_old_len
even if buf_pool->LRU_old is NULL.  Check that bpage->old is monotonic.

buf_relocate(): Make the UNIV_LRU_DEBUG checks stricter.

buf0buf.h: Revise the documentation of buf_page_t::old and
buf_pool_t::LRU_old_len.
2009-10-29 11:04:11 +00:00
marko
8ee0f733e5 branches/zip: buf_page_set_old(): Improve UNIV_LRU_DEBUG diagnostics
in order to catch the buf_pool->LRU_old corruption reported in Issue .

buf_LRU_old_init(): Set the property from the tail towards the front
of the buf_pool->LRU list, in order not to trip the debug check.
2009-10-28 14:10:34 +00:00
inaam
51c89873d1 branches/zip rb://182
Call fsync() on datafiles after a batch of pages is written to disk
even when skip_innodb_doublewrite is set.

Approved by: Heikki
2009-10-13 16:43:13 +00:00
inaam
1f30efe96f branches/zip rb://176
Do not invalidate buffer pool while an LRU batch is active. Added
code to buf_pool_invalidate() to wait for the running batches to finish.

This patch also resets the state of buf_pool struct at invalidation. This
addresses the concern where buf_pool->freed_page_clock becomes non-zero
because we read in a system tablespace page for file format info at
startup.

Approved by: Marko
2009-10-05 13:45:35 +00:00
marko
3b38bf02cb branches/zip: Do not write to PAGE_INDEX_ID after page creation,
not even when restoring an uncompressed page after a compression failure.

btr_page_reorganize_low(): On compression failure, do not restore
those page header fields that should not be affected by the
reorganization.  Instead, compare the fields.

page_zip_decompress(): Add the parameter ibool all, for copying all
page header fields.  Pass the parameter all=TRUE on block read
completion, redo log application, and page_zip_validate(); pass
all=FALSE in all other cases.

page_zip_reorganize(): Do not restore the uncompressed page on
failure.  It will be restored (to pre-modification state) by the
caller anyway.

rb://167, Issue 
2009-09-28 07:52:25 +00:00
inaam
1490f879e2 branches/zip rb://159
For pages that are not made young, the counter is incremented
only when the page in question is 'old'. For pages that are made
young, the counter is incremented for all pages. For an
apples-to-apples comparison, this patch changes the 'young-making'
counter to consider only 'old' blocks.

Approved by: Marko
2009-09-14 14:20:48 +00:00
marko
3bd1a9fbfd branches/zip: Reduce mutex contention that was introduced when
addressing Bug  (Issue ), in r5703.

buf_page_set_accessed_make_young(): New auxiliary function, called by
buf_page_get_zip(), buf_page_get_gen(),
buf_page_optimistic_get_func(). Call ut_time_ms() outside of
buf_pool_mutex. Use cached access_time.

buf_page_set_accessed(): Add the parameter time_ms, so that
ut_time_ms() need not be called while holding buf_pool_mutex.

buf_page_optimistic_get_func(), buf_page_get_known_nowait(): Read
buf_page_t::access_time without holding buf_pool_mutex. This should be
OK, because the field is only used for heuristic purposes.

buf_page_peek_if_too_old(): If buf_pool->freed_page_clock == 0, return
FALSE, so that we will not waste time moving blocks in the LRU list in
the warm-up phase or when the workload fits in the buffer pool.

rb://156 approved by Sunny Bains
2009-09-10 09:47:09 +00:00
marko
baaf443343 branches/zip: buf_page_release(): De-stutter the function comment. 2009-09-10 09:10:20 +00:00
vasil
549e764431 branches/zip:
Fix a bug in manipulating the variable innodb_old_blocks_pct:

for any value assigned it got that value -1, except for 75. When
assigned 75, it got 75.

  mysql> set global innodb_old_blocks_pct=15;
  Query OK, 0 rows affected (0.00 sec)
  
  mysql> show variables like 'innodb_old_blocks_pct';
  +-----------------------+-------+
  | Variable_name         | Value |
  +-----------------------+-------+
  | innodb_old_blocks_pct | 14    | 
  +-----------------------+-------+
  1 row in set (0.00 sec)
  
  mysql> set global innodb_old_blocks_pct=75;
  Query OK, 0 rows affected (0.00 sec)
  
  mysql> show variables like 'innodb_old_blocks_pct';
  +-----------------------+-------+
  | Variable_name         | Value |
  +-----------------------+-------+
  | innodb_old_blocks_pct | 75    | 
  +-----------------------+-------+

After the fix it gets exactly what was assigned.
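
The message does not say what caused the off-by-one. One plausible explanation, offered here purely as an assumption, is that the percentage is stored as a fixed-point ratio and converted back by truncating division. The sketch below assumes a denominator of 1024 (`OLD_RATIO_DIV` and the function names are hypothetical) and reproduces the symptoms shown above: 15 reads back as 14, while 75 survives because 75 * 1024 / 100 is exact.

```c
#include <assert.h>

/* Assumed fixed-point denominator for the stored ratio. */
#define OLD_RATIO_DIV	1024

/* Convert a percentage to the stored fixed-point ratio (truncating). */
static unsigned
pct_to_ratio(unsigned pct)
{
	assert(pct <= 100);

	return(pct * OLD_RATIO_DIV / 100);
}

/* Convert back by truncation: the result is one less than the
assigned value whenever the stored ratio was inexact. */
static unsigned
ratio_to_pct_truncated(unsigned ratio)
{
	return(ratio * 100 / OLD_RATIO_DIV);
}

/* Convert back with rounding to nearest: the result is the value
that was assigned. */
static unsigned
ratio_to_pct_rounded(unsigned ratio)
{
	return((ratio * 100 + OLD_RATIO_DIV / 2) / OLD_RATIO_DIV);
}
```

Under this assumption, rounding to nearest on the way back is enough to recover every value in the permitted range.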

Approved by:	Marko (via IM)
2009-09-09 12:35:58 +00:00
marko
36b963cc5e branches/zip: Remove BUF_LRU_INITIAL_RATIO, which should have been removed
together with buf_LRU_get_recent_limit().
2009-09-08 14:50:25 +00:00
marko
4139a2e4cf branches/zip: buf_chunk_not_freed(): Do not acquire block->mutex unless
block->page.state == BUF_BLOCK_FILE_PAGE.  Check that block->page.state
makes sense.

Approved by Sunny Bains over the IM.
2009-08-31 05:10:10 +00:00
inaam
07e2ab7aab branches/zip
Remove redundant TRUE : FALSE from the return statement
2009-08-27 21:43:32 +00:00
inaam
3811787c3e branches/zip
Remove unused macros as we erased the random readahead code in r5703.
Also fixed some comments.
2009-08-27 15:20:35 +00:00
inaam
8658a2c43d branches/zip rb://147
Removed the following two status variables:

innodb_buffer_pool_read_ahead_rnd
innodb_buffer_pool_read_ahead_seq

Introduced two new status variables:
innodb_buffer_pool_read_ahead = number of pages read as part of
readahead since server startup
innodb_buffer_pool_read_ahead_evicted = number of pages that are read
in as readahead but were evicted before ever being accessed since
server startup i.e.: a measure of how badly our readahead is
performing

SHOW INNODB STATUS will show two extra numbers in buffer pool section:
pages read ahead/sec and pages evicted without access/sec

Approved by: Marko
2009-08-27 15:00:27 +00:00
marko
0f7895d477 branches/zip: Replace the constant 3/8 ratio that controls the LRU_old
size with the settable global variable innodb_old_blocks_pct. The
minimum and maximum values are 5 and 95 per cent, respectively. The
default is 100*3/8, in line with the old behavior.

ut_time_ms(): New utility function, to return the current time in
milliseconds. TODO: Is there a more efficient timestamp function, such
as rdtsc divided by a power of two?

buf_LRU_old_threshold_ms: New variable, corresponding to
innodb_old_blocks_time. The value 0 is the default behaviour: no
timeout before making blocks 'new'.

bpage->accessed, bpage->LRU_position, buf_pool->ulint_clock: Remove.

bpage->access_time: New field, replacing bpage->accessed. Protected by
buf_pool_mutex instead of bpage->mutex. Updated when a page is created
or accessed the first time in the buffer pool.

buf_LRU_old_ratio, innobase_old_blocks_pct: New variables,
corresponding to innodb_old_blocks_pct

buf_LRU_old_ratio_update(), innobase_old_blocks_pct_update(): Update
functions for buf_LRU_old_ratio, innobase_old_blocks_pct.

buf_page_peek_if_too_old(): Compare ut_time_ms() to bpage->access_time
if buf_LRU_old_threshold_ms && bpage->old.  Else observe
buf_LRU_old_ratio and bpage->freed_page_clock.
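
The decision described for buf_page_peek_if_too_old() can be sketched as below. This is not the InnoDB code: `struct page_sketch`, `peek_if_too_old` and the `ratio_says_too_old` parameter are hypothetical, and the ratio-based branch is left to the caller because the message does not spell it out. The time branch assumes that a block in the old sublist becomes eligible only after the threshold has elapsed since its first access.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the per-page bookkeeping. */
struct page_sketch {
	unsigned long	access_time;	/* ms timestamp of first access,
					or 0 if never accessed */
	int		old;		/* nonzero if in the old sublist */
};

/* Return nonzero if the page should be moved to the new sublist. */
static int
peek_if_too_old(const struct page_sketch* page, unsigned long now_ms,
		unsigned long old_threshold_ms, int ratio_says_too_old)
{
	assert(page != NULL);

	if (old_threshold_ms && page->old) {
		/* Time-based rule: wait until the block has been
		in the pool for at least the threshold. */
		return(page->access_time > 0
		       && now_ms - page->access_time >= old_threshold_ms);
	}

	/* Otherwise fall back to the ratio-based rule, computed
	elsewhere in this sketch. */
	return(ratio_says_too_old);
}
```

With a threshold of 0 the time-based branch is never taken, which matches the stated default of no timeout.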

buf_pool_t: Add n_pages_made_young, n_pages_not_made_young,
n_pages_made_young_old, n_pages_not_made_young, for statistics.

buf_print(): Display buf_pool->n_pages_made_young,
buf_pool->n_pages_not_made_young.  This function is only for crash
diagnostics.

buf_print_io(): Display buf_pool->LRU_old_len and quantities derived
from buf_pool->n_pages_made_young, buf_pool->n_pages_not_made_young.
This function is invoked by SHOW ENGINE INNODB STATUS.

rb://129 approved by Heikki Tuuri.  This addresses Bug .
2009-08-27 06:25:00 +00:00
calvin
173f74eabe branches/zip: remove duplicate "the" in comments. 2009-08-06 22:04:03 +00:00
inaam
c34ab748cc branches/zip
Change the read ahead parameter name to innodb_read_ahead_threshold.
Change the meaning of this parameter to signify the number of pages
that must be sequentially accessed for InnoDB to trigger a readahead
request.

Suggested by: Ken
2009-07-20 15:23:15 +00:00
inaam
9af090cb0e branches/zip
Fixed warnings on Windows where ulint != ib_uint64_t
2009-07-13 17:04:57 +00:00
inaam
ec40f5cd73 branches/zip rb://138 (REVERT)
Revert the flush neighbors patch as it shows regression in
the benchmarks run by Michael.
2009-07-13 14:48:45 +00:00
inaam
43fceb74f2 branches/zip rb://133
This patch introduces heuristics based flushing rate of dirty pages to
avoid IO bursts at checkpoint.

1) log_capacity divided by the log generated per second gives us the
number of seconds in which ALL dirty pages need to be flushed. Based on
this rough assumption we can say that
n_dirty_pages / (log_capacity / log_generation_rate) = desired_flush_rate

2) We use weighted averages (hard coded to 20 seconds) of
log_generation_rate to avoid resonance.

3) From the desired_flush_rate we subtract the number of pages that have
been flushed due to LRU flushing. That gives us pages that we should
flush as part of flush_list cleanup. And that is the number (capped by
maximum io_capacity) that we try to flush from the master thread.
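
The arithmetic in steps 1 and 3 can be sketched as follows. The function and parameter names are hypothetical, the weighted averaging of step 2 is left out (the caller is assumed to pass an already averaged log_generation_rate), and the division is rearranged so that a zero generation rate does not divide by zero.

```c
#include <assert.h>

/* Number of pages per second to flush from the flush_list, following
the heuristic in the commit message. */
static unsigned long
flush_list_pages_wanted(unsigned long n_dirty_pages,
			unsigned long log_capacity,
			unsigned long log_generation_rate,
			unsigned long lru_flush_rate,
			unsigned long io_capacity)
{
	unsigned long long	rate;

	assert(log_capacity > 0);

	/* Step 1: n_dirty_pages / (log_capacity / log_generation_rate),
	written as a multiplication followed by one division. */
	rate = (unsigned long long) n_dirty_pages
		* log_generation_rate / log_capacity;

	/* Step 3: pages already flushed by LRU flushing count
	towards the desired rate. */
	if (rate <= lru_flush_rate) {
		return(0);
	}

	rate -= lru_flush_rate;

	/* Cap the result by the maximum io_capacity. */
	return(rate > io_capacity ? io_capacity : (unsigned long) rate);
}
```

For example, with 10000 dirty pages, a log capacity of 1000000 and a generation rate of 50000 per second, the desired rate is 500 pages per second; after subtracting 100 pages of LRU flushing, 400 remain, and an io_capacity of 200 caps that at 200.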

Knobs:
======

innodb_adaptive_flushing: boolean, global, dynamic, default TRUE.
Since this heuristic is very experimental and has the potential to
dramatically change the IO pattern I think it is a good idea to leave a
knob to turn it off.

Approved by: Heikki
2009-07-08 15:11:40 +00:00
inaam
449e6af3c6 branches/zip rb://138
The current implementation is to try to flush the neighbors of every
page that we flush. This patch makes the following distinction:

1) If the flush is from flush_list AND
2) If the flush is intended to move the oldest_modification LSN ahead
(this happens when a user thread sees little space in the log file and
attempts to flush pages from the buffer pool so that a checkpoint can
be made)

THEN

Do not try to flush the neighbors. Just focus on flushing dirty pages at
the end of flush_list

Approved by: Heikki
2009-07-07 22:00:49 +00:00
marko
4a447cde2e branches/zip: lock_print_info_all_transactions(), buf_read_recv_pages():
Tolerate missing tablespaces (zip_size==ULINT_UNDEFINED).
buf_page_get_gen(): Add ut_ad(ut_is_2pow(zip_size)).

Issue , rb://136 approved by Sunny Bains
2009-06-29 08:54:53 +00:00
marko
17105a0ad9 branches/zip: buf_page_get_gen(): Fix a race condition when reading
buf_fix_count.  This could explain Issue .
Tested by Michael.
2009-06-29 08:24:27 +00:00
marko
c0bda951fa branches/zip: buf_page_get_zip(): Fix a bogus warning about
block_mutex being possibly uninitialized.
2009-06-22 08:31:35 +00:00
marko
2bb0307bde branches/zip: buf_page_get_zip(): Eliminate a buf_page_get_mutex() call.
The function must switch on the block state anyway.
2009-06-16 08:00:48 +00:00
marko
484de4894f branches/zip: buf_page_get_gen(): Reduce mutex holding time by adjusting
buf_pool->n_pend_unzip while only holding buf_pool_mutex.
2009-06-16 07:08:59 +00:00
vasil
a3548774c6 branches/zip:
Fix Mantis Issue#244: bug in linear read ahead (no check on access pattern)

The changes are:

1) Take into account access pattern when deciding whether or not to do linear
  read ahead.
2) Expose a knob innodb_read_ahead_factor = [0-64] default (8), dynamic,
  global to control linear read ahead behavior
3) Disable random read ahead. Keep the code for now.

Submitted by:	Inaam (rb://122)
Approved by:	Heikki (rb://122)
2009-06-05 14:13:31 +00:00
marko
68a1ee9960 branches/zip: Add some Doxygen comments for many structs, typedefs,
#defines and global variables.  Many are still missing.
2009-05-26 12:28:49 +00:00
marko
11ff89d994 branches/zip: Add @file comments, and convert decorative
/*********************************
comments to Doxygen /** style like this:
/*****************************//**

This conversion was performed by the following command:

perl -i -e 'while(<ARGV>){if (m|^/\*{30}\**$|) {
s|\*{4}$|//**| if ++$com>1; $_ .= "\@file $ARGV\n" if $com==2}
print; if(eof){$.=0;undef $com}}' */*[ch] include/univ.i
2009-05-25 09:52:29 +00:00
marko
d075e80c49 branches/zip: Split some long lines that were introduced in r5091. 2009-05-25 08:09:45 +00:00
marko
e49dee377b branches/zip: Convert the function comments to Doxygen format.
This patch was created by running the following commands:

for i in */*[ch]; do doxygenify.pl $i; done
perl -i -pe 's#\*{3} \*/$#****/#' */*[ch]

where doxygenify.pl is
https://svn.innodb.com/svn/misc/trunk/tools/doxygenify.pl r510

Verified the consistency as follows:

(0) not too many /* in: */ or /* out: */ comments left in the code:
grep -l '/\*\s*\(in\|out\)[,:/]' */*[ch]

(1) no difference when ignoring blank lines, after stripping all
C90-style /* comments */, including multi-line ones, before and after
applying this patch:

perl -i -e 'undef $/;while(<ARGV>){s#/\*(.*?)\*/##gs;print}' */*[ch]
diff -I'^\s*$' --exclude .svn -ru TREE1 TREE2

(2) after stripping @return comments and !<, generated a diff and omitted
the hunks where /* out: */ function return comments were removed:

perl -i -e'undef $/;while(<ARGV>){s#!<##g;s#\n\@return\t.*?\*/# \*/#gs;print}'\
 */*[ch]
svn diff|
perl -e 'undef $/;$_=<>;s#\n-\s*/\* out[:,]([^\n]*?)(\n-[^\n]*?)*\*/##gs;print'

Some unintended changes were left.  These will be removed in a
subsequent patch.
2009-05-25 05:30:14 +00:00
marko
79362a389d branches/zip: Remove bogus out: comments of functions returning void. 2009-05-19 07:00:51 +00:00
marko
db9dc3bb20 branches/zip: Add missing out: comments to nullary functions. 2009-05-19 06:30:02 +00:00
marko
b10bc48d35 branches/zip: Add some missing out: comments to buf0buf.h, buf0buf.c. 2009-05-18 12:36:10 +00:00
marko
a512de6783 branches/zip: buf_validate(): Add missing out: comment. 2009-05-18 12:29:51 +00:00
marko
a0714b182c branches/zip: univ.i: Define REFMAN as the base URL of the
MySQL Reference Manual and use it in every string.
This fixes Issue .
2009-04-16 12:02:27 +00:00
inaam
b76aa20cbc branches/zip
SHOW ENGINE INNODB MUTEX shows all mutexes and rw_locks. This can
be overwhelming particularly when the buffer pool is very large
(note that each block in buffer pool has at least one mutex, one
rw_lock and an additional mutex if rw_lock does not use atomics).
With this patch, the status of the following mutexes and rw-locks is not shown:

1) block->mutex
2) block->lock
3) block->lock->mutex (if applicable)
4) All other mutexes and rw-locks for which the number of os-waits is zero

Addresses issue# 179 rb://99

Approved by: Marko
2009-03-25 17:18:33 +00:00
marko
d90bea085a branches/zip: Remove unneeded definitions and dependencies
from UNIV_HOTBACKUP builds.
2009-03-23 14:21:34 +00:00
marko
83e98148b5 branches/zip: buf_page_print(): Clean up the code #ifdef UNIV_HOTBACKUP. 2009-03-23 10:05:47 +00:00