that Valgrind will not complain about freed data structures that are
reachable via pointers. This addresses Bug #45992 and Bug #46656.
This patch is mostly based on changes copied from branches/embedded-1.0,
mainly c5432, c3439, c3134, c2994, c2978, but also some other code was
copied. Some added cleanup code is specific to MySQL/InnoDB.
rb://199 approved by Sunny Bains
This was reported as Issue #381.
buf_page_set_old(): Assert that blocks may only be set old if
buf_pool->LRU_old is initialized and buf_pool->LRU_old_len is nonzero.
Assert that buf_pool->LRU_old points to the block at the old/new boundary.
buf_LRU_old_adjust_len(): Invoke buf_page_set_old() after adjusting
buf_pool->LRU_old and buf_pool->LRU_old_len, in order not to violate
the added assertions.
buf_LRU_old_init(): Replace buf_page_set_old() with a direct
assignment to bpage->old, because these loops that initialize all the
blocks would temporarily violate the assertions about
buf_pool->LRU_old.
buf_LRU_remove_block(): When setting buf_pool->LRU_old = NULL, also
clear all bpage->old flags and set buf_pool->LRU_old_len = 0.
buf_LRU_add_block_to_end_low(), buf_LRU_add_block_low(): Move the
buf_page_set_old() call later in order not to violate the debug
assertions. If buf_pool->LRU_old is NULL, set old=FALSE.
buf_LRU_free_block(): Replace the UNIV_LRU_DEBUG assertion with a
dummy buf_page_set_old() call that performs more thorough checks.
buf_LRU_validate(): Do not tolerate garbage in buf_pool->LRU_old_len
even if buf_pool->LRU_old is NULL. Check that bpage->old is monotonic.
buf_relocate(): Make the UNIV_LRU_DEBUG checks stricter.
buf0buf.h: Revise the documentation of buf_page_t::old and
buf_pool_t::LRU_old_len.
in order to catch the buf_pool->LRU_old corruption reported in Issue #381.
buf_LRU_old_init(): Set the property from the tail towards the front
of the buf_pool->LRU list, in order not to trip the debug check.
Do not invalidate buffer pool while an LRU batch is active. Added
code to buf_pool_invalidate() to wait for the running batches to finish.
This patch also resets the state of buf_pool struct at invalidation. This
addresses the concern where buf_pool->freed_page_clock becomes non-zero
because we read in a system tablespace page for file format info at
startup.
Approved by: Marko
not even when restoring an uncompressed page after a compression failure.
btr_page_reorganize_low(): On compression failure, do not restore
those page header fields that should not be affected by the
reorganization. Instead, compare the fields.
page_zip_decompress(): Add the parameter ibool all, for copying all
page header fields. Pass the parameter all=TRUE on block read
completion, redo log application, and page_zip_validate(); pass
all=FALSE in all other cases.
page_zip_reorganize(): Do not restore the uncompressed page on
failure. It will be restored (to pre-modification state) by the
caller anyway.
rb://167, Issue #346
In case of pages that are not made young the counter is incremented
only when the page in question is 'old'. In case of pages that are
made young the counter is incremented in case of all pages. For apple
to apple comparison this patch changes the 'young-making' counter to
consider only 'old' blocks.
Approved by: Marko
addressing Bug #45015 (Issue #316), in r5703.
buf_page_set_accessed_make_young(): New auxiliary function, called by
buf_page_get_zip(), buf_page_get_gen(),
buf_page_optimistic_get_func(). Call ut_time_ms() outside of
buf_pool_mutex. Use cached access_time.
buf_page_set_accessed(): Add the parameter time_ms, so that
ut_time_ms() need not be called while holding buf_pool_mutex.
buf_page_optimistic_get_func(), buf_page_get_known_nowait(): Read
buf_page_t::access_time without holding buf_pool_mutex. This should be
OK, because the field is only used for heuristic purposes.
buf_page_peek_if_too_old(): If buf_pool->freed_page_clock == 0, return
FALSE, so that we will not waste time moving blocks in the LRU list in
the warm-up phase or when the workload fits in the buffer pool.
rb://156 approved by Sunny Bains
Fix a bug in manipulating the variable innodb_old_blocks_pct:
for any value assigned it got that value -1, except for 75. When
assigned 75, it got 75.
mysql> set global innodb_old_blocks_pct=15;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like 'innodb_old_blocks_pct';
+-----------------------+-------+
| Variable_name | Value |
+-----------------------+-------+
| innodb_old_blocks_pct | 14 |
+-----------------------+-------+
1 row in set (0.00 sec)
mysql> set global innodb_old_blocks_pct=75;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like 'innodb_old_blocks_pct';
+-----------------------+-------+
| Variable_name | Value |
+-----------------------+-------+
| innodb_old_blocks_pct | 75 |
+-----------------------+-------+
After the fix it gets exactly what was assigned.
Approved by: Marko (via IM)
Done away with following two status variables:
innodb_buffer_pool_read_ahead_rnd
innodb_buffer_pool_read_ahead_seq
Introduced two new status variables:
innodb_buffer_pool_read_ahead = number of pages read as part of
readahead since server startup
innodb_buffer_pool_read_ahead_evicted = number of pages that are read
in as readahead but were evicted before ever being accessed since
server startup i.e.: a measure of how badly our readahead is
performing
SHOW INNODB STATUS will show two extra numbers in buffer pool section:
pages read ahead/sec and pages evicted without access/sec
Approved by: Marko
size with the settable global variable innodb_old_blocks_pct. The
minimum and maximum values are 5 and 95 per cent, respectively. The
default is 100*3/8, in line with the old behavior.
ut_time_ms(): New utility function, to return the current time in
milliseconds. TODO: Is there a more efficient timestamp function, such
as rdtsc divided by a power of two?
buf_LRU_old_threshold_ms: New variable, corresponding to
innodb_old_blocks_time. The value 0 is the default behaviour: no
timeout before making blocks 'new'.
bpage->accessed, bpage->LRU_position, buf_pool->ulint_clock: Remove.
bpage->access_time: New field, replacing bpage->accessed. Protected by
buf_pool_mutex instead of bpage->mutex. Updated when a page is created
or accessed the first time in the buffer pool.
buf_LRU_old_ratio, innobase_old_blocks_pct: New variables,
corresponding to innodb_old_blocks_pct
buf_LRU_old_ratio_update(), innobase_old_blocks_pct_update(): Update
functions for buf_LRU_old_ratio, innobase_old_blocks_pct.
buf_page_peek_if_too_old(): Compare ut_time_ms() to bpage->access_time
if buf_LRU_old_threshold_ms && bpage->old. Else observe
buf_LRU_old_ratio and bpage->freed_page_clock.
buf_pool_t: Add n_pages_made_young, n_pages_not_made_young,
n_pages_made_young_old, n_pages_not_made_young, for statistics.
buf_print(): Display buf_pool->n_pages_made_young,
buf_pool->n_pages_not_made_young. This function is only for crash
diagnostics.
buf_print_io(): Display buf_pool->LRU_old_len and quantities derived
from buf_pool->n_pages_made_young, buf_pool->n_pages_not_made_young.
This function is invoked by SHOW ENGINE INNODB STATUS.
rb://129 approved by Heikki Tuuri. This addresses Bug #45015.
Change the read ahead parameter name to innodb_read_ahead_threshold.
Change the meaning of this parameter to signify the number of pages
that must be sequentially accessed for InnoDB to trigger a readahead
request.
Suggested by: Ken
This patch introduces heuristics based flushing rate of dirty pages to
avoid IO bursts at checkpoint.
1) log_capacity / log_generated per second gives us number of seconds
in which ALL dirty pages need to be flushed. Based on this rough
assumption we can say that
n_dirty_pages / (log_capacity / log_generation_rate) = desired_flush_rate
2) We use weighted averages (hard coded to 20 seconds) of
log_generation_rate to avoid resonance.
3) From the desired_flush_rate we subtract the number of pages that have
been flushed due to LRU flushing. That gives us pages that we should
flush as part of flush_list cleanup. And that is the number (capped by
maximum io_capacity) that we try to flush from the master thread.
Knobs:
======
innodb_adaptive_flushing: boolean, global, dynamic, default TRUE.
Since this heuristic is very experimental and has the potential to
dramatically change the IO pattern I think it is a good idea to leave a
knob to turn it off.
Approved by: Heikki
The current implementation is to try to flush the neighbors of every
page that we flush. This patch makes the following distinction:
1) If the flush is from flush_list AND
2) If the flush is intended to move the oldest_modification LSN ahead
(this happens when a user thread sees little space in the log file and
attempts to flush pages from the buffer pool so that a checkpoint can
be made)
THEN
Do not try to flush the neighbors. Just focus on flushing dirty pages at
the end of flush_list
Approved by: Heikki
Fix Mantis Issue#244 fix bug in linear read ahead (no check on access pattern)
The changes are:
1) Take into account access pattern when deciding whether or not to do linear
read ahead.
2) Expose a knob innodb_read_ahead_factor = [0-64] default (8), dynamic,
global to control linear read ahead behvior
3) Disable random read ahead. Keep the code for now.
Submitted by: Inaam (rb://122)
Approved by: Heikki (rb://122)
/*********************************
comments to Doxygen /** style like this:
/*****************************//**
This conversion was performed by the following command:
perl -i -e 'while(<ARGV>){if (m|^/\*{30}\**$|) {
s|\*{4}$|//**| if ++$com>1; $_ .= "\@file $ARGV\n" if $com==2}
print; if(eof){$.=0;undef $com}}' */*[ch] include/univ.i
This patch was created by running the following commands:
for i in */*[ch]; do doxygenify.pl $i; done
perl -i -pe 's#\*{3} \*/$#****/#' */*[ch]
where doxygenify.pl is
https://svn.innodb.com/svn/misc/trunk/tools/doxygenify.pl r510
Verified the consistency as follows:
(0) not too many /* in: */ or /* out: */ comments left in the code:
grep -l '/\*\s*\(in\|out\)[,:/]' */*[ch]
(1) no difference when ignoring blank lines, after stripping all
C90-style /* comments */, including multi-line ones, before and after
applying this patch:
perl -i -e 'undef $/;while(<ARGV>){s#/\*(.*?)\*/##gs;print}' */*[ch]
diff -I'^\s*$' --exclude .svn -ru TREE1 TREE2
(2) after stripping @return comments and !<, generated a diff and omitted
the hunks where /* out: */ function return comments were removed:
perl -i -e'undef $/;while(<ARGV>){s#!<##g;s#\n\@return\t.*?\*/# \*/#gs;print}'\
*/*[ch]
svn diff|
perl -e 'undef $/;$_=<>;s#\n-\s*/\* out[:,]([^\n]*?)(\n-[^\n]*?)*\*/##gs;print'
Some unintended changes were left. These will be removed in a
subsequent patch.
SHOW ENGINE INNODB MUTEX shows all mutexes and rw_locks. This can
be overwhelming particularly when the buffer pool is very large
(note that each block in buffer pool has at least one mutex, one
rw_lock and an additional mutex if rw_lock does not use atomics).
With this patch status of following mutexes and rw-locks is not shown:
1) block->mutex
2) block->lock
3) block->lock->mutex (if applicable)
4) All other mutexes and rw-locks for which number of os-waits are zero
Addresses issue# 179 rb://99
Approved by: Marko
* Remove old Innobase copyright lines from C source files
* Add a reference to the GPLv2 license as recommended by the lawyers
at Oracle Legal
[Step 1/28]
To the files touched by the Google patch from c4144 (excluding
include/os0sync.ic because later we removed Google code from that file):
* Remove the Google license
* Remove old Innobase copyright lines
* Add a reference to the Google license and to the GPLv2 license at the top,
as recommended by the lawyers at Oracle Legal.
This patch changes the innodb mutexes and rw_locks implementation.
On supported platforms it uses GCC builtin atomics. These changes
are based on the patch sent by Mark Callaghan of Google under BSD
license. More technical discussion can be found at rb://30
Approved by: Heikki
Both are minor changes:
1) Compiler error introduced in r4072: double ';' at the end.
2) Warning introduced in r3613: \mem\mem0pool.c(481) :
warning C4098: 'mem_area_free' : 'void' function returning a value
Approved by: Sunny (IM)