mirror of
https://github.com/MariaDB/server.git
synced 2025-01-17 20:42:30 +01:00
abfdf76bbf
From revision r5703 to r5716 Detailed revision comments: r5703 | marko | 2009-08-27 02:25:00 -0500 (Thu, 27 Aug 2009) | 41 lines branches/zip: Replace the constant 3/8 ratio that controls the LRU_old size with the settable global variable innodb_old_blocks_pct. The minimum and maximum values are 5 and 95 per cent, respectively. The default is 100*3/8, in line with the old behavior. ut_time_ms(): New utility function, to return the current time in milliseconds. TODO: Is there a more efficient timestamp function, such as rdtsc divided by a power of two? buf_LRU_old_threshold_ms: New variable, corresponding to innodb_old_blocks_time. The value 0 is the default behaviour: no timeout before making blocks 'new'. bpage->accessed, bpage->LRU_position, buf_pool->ulint_clock: Remove. bpage->access_time: New field, replacing bpage->accessed. Protected by buf_pool_mutex instead of bpage->mutex. Updated when a page is created or accessed the first time in the buffer pool. buf_LRU_old_ratio, innobase_old_blocks_pct: New variables, corresponding to innodb_old_blocks_pct buf_LRU_old_ratio_update(), innobase_old_blocks_pct_update(): Update functions for buf_LRU_old_ratio, innobase_old_blocks_pct. buf_page_peek_if_too_old(): Compare ut_time_ms() to bpage->access_time if buf_LRU_old_threshold_ms && bpage->old. Else observe buf_LRU_old_ratio and bpage->freed_page_clock. buf_pool_t: Add n_pages_made_young, n_pages_not_made_young, n_pages_made_young_old, n_pages_not_made_young, for statistics. buf_print(): Display buf_pool->n_pages_made_young, buf_pool->n_pages_not_made_young. This function is only for crash diagnostics. buf_print_io(): Display buf_pool->LRU_old_len and quantities derived from buf_pool->n_pages_made_young, buf_pool->n_pages_not_made_young. This function is invoked by SHOW ENGINE INNODB STATUS. rb://129 approved by Heikki Tuuri. This addresses Bug #45015. r5704 | marko | 2009-08-27 03:31:17 -0500 (Thu, 27 Aug 2009) | 32 lines branches/zip: Fix a critical bug in fast index creation that could corrupt the created indexes. row_merge(): Make "half" an in/out parameter. Determine the offset of half the output file. Copy the last blocks record-by-record instead of block-by-block, so that the records can be counted. Check that the input and output have matching n_rec. row_merge_sort(): Do not assume that two blocks of size N are merged into a block of size 2*N. The output block can be shorter than the input if the last page of each input block is almost empty. Use an accurate termination condition, based on the "half" computed by row_merge(). row_merge_read(), row_merge_write(), row_merge_blocks(): Add debug output. merge_file_t, row_merge_file_create(): Add n_rec, the number of records in the merge file. row_merge_read_clustered_index(): Update n_rec. row_merge_blocks(): Update and check n_rec. row_merge_blocks_copy(): New function, for copying the last blocks in row_merge(). Update and check n_rec. This bug was discovered with a user-supplied test case that creates an index where the initial temporary file is 249 one-megabyte blocks and the merged files become smaller. In the test, possible merge record sizes are 10, 18, and 26 bytes. rb://150 approved by Sunny Bains. This addresses Issue #320. r5705 | marko | 2009-08-27 06:56:24 -0500 (Thu, 27 Aug 2009) | 11 lines branches/zip: dict_index_find_cols(): On column name lookup failure, return DB_CORRUPTION (HA_ERR_CRASHED) instead of abnormally terminating the server. Also, disable the previously added diagnostic output to the error log, because mysql-test-run does not like extra output in the error log. (Bug #44571) dict_index_add_to_cache(): Handle errors from dict_index_find_cols(). mysql-test/innodb_bug44571.test: A test case for triggering the bug. rb://135 approved by Sunny Bains. r5706 | inaam | 2009-08-27 11:00:27 -0500 (Thu, 27 Aug 2009) | 20 lines branches/zip rb://147 Done away with following two status variables: innodb_buffer_pool_read_ahead_rnd innodb_buffer_pool_read_ahead_seq Introduced two new status variables: innodb_buffer_pool_read_ahead = number of pages read as part of readahead since server startup innodb_buffer_pool_read_ahead_evicted = number of pages that are read in as readahead but were evicted before ever being accessed since server startup i.e.: a measure of how badly our readahead is performing SHOW INNODB STATUS will show two extra numbers in buffer pool section: pages read ahead/sec and pages evicted without access/sec Approved by: Marko r5707 | inaam | 2009-08-27 11:20:35 -0500 (Thu, 27 Aug 2009) | 6 lines branches/zip Remove unused macros as we erased the random readahead code in r5703. Also fixed some comments. r5708 | inaam | 2009-08-27 17:43:32 -0500 (Thu, 27 Aug 2009) | 4 lines branches/zip Remove redundant TRUE : FALSE from the return statement r5709 | inaam | 2009-08-28 01:22:46 -0500 (Fri, 28 Aug 2009) | 5 lines branches/zip rb://152 Disable display of deprecated parameter innodb_file_io_threads in 'show variables'. r5714 | marko | 2009-08-31 01:10:10 -0500 (Mon, 31 Aug 2009) | 5 lines branches/zip: buf_chunk_not_freed(): Do not acquire block->mutex unless block->page.state == BUF_BLOCK_FILE_PAGE. Check that block->page.state makes sense. Approved by Sunny Bains over the IM. r5716 | vasil | 2009-08-31 02:47:49 -0500 (Mon, 31 Aug 2009) | 9 lines branches/zip: Fix Bug#46718 InnoDB plugin incompatible with gcc 4.1 (at least: on PPC): "Undefined symbol" by implementing our own check in plug.in instead of using the result from the check from MySQL because it is insufficient. Approved by: Marko (rb://154)
295 lines
11 KiB
C
295 lines
11 KiB
C
/*****************************************************************************
|
|
|
|
Copyright (c) 1995, 2009, Innobase Oy. All Rights Reserved.
|
|
|
|
This program is free software; you can redistribute it and/or modify it under
|
|
the terms of the GNU General Public License as published by the Free Software
|
|
Foundation; version 2 of the License.
|
|
|
|
This program is distributed in the hope that it will be useful, but WITHOUT
|
|
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
|
|
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
|
|
|
|
You should have received a copy of the GNU General Public License along with
|
|
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
|
|
Place, Suite 330, Boston, MA 02111-1307 USA
|
|
|
|
*****************************************************************************/
|
|
|
|
/**************************************************//**
|
|
@file include/buf0lru.h
|
|
The database buffer pool LRU replacement algorithm
|
|
|
|
Created 11/5/1995 Heikki Tuuri
|
|
*******************************************************/
|
|
|
|
#ifndef buf0lru_h
|
|
#define buf0lru_h
|
|
|
|
#include "univ.i"
|
|
#include "ut0byte.h"
|
|
#include "buf0types.h"
|
|
|
|
/** The return type of buf_LRU_free_block() */
|
|
enum buf_lru_free_block_status {
|
|
/** freed */
|
|
BUF_LRU_FREED = 0,
|
|
/** not freed because the caller asked to remove the
|
|
uncompressed frame but the control block cannot be
|
|
relocated */
|
|
BUF_LRU_CANNOT_RELOCATE,
|
|
/** not freed because of some other reason */
|
|
BUF_LRU_NOT_FREED
|
|
};
|
|
|
|
/******************************************************************//**
|
|
Tries to remove LRU flushed blocks from the end of the LRU list and put them
|
|
to the free list. This is beneficial for the efficiency of the insert buffer
|
|
operation, as flushed pages from non-unique non-clustered indexes are here
|
|
taken out of the buffer pool, and their inserts redirected to the insert
|
|
buffer. Otherwise, the flushed blocks could get modified again before read
|
|
operations need new buffer blocks, and the i/o work done in flushing would be
|
|
wasted. */
|
|
UNIV_INTERN
|
|
void
|
|
buf_LRU_try_free_flushed_blocks(void);
|
|
/*==================================*/
|
|
/******************************************************************//**
|
|
Returns TRUE if less than 25 % of the buffer pool is available. This can be
|
|
used in heuristics to prevent huge transactions eating up the whole buffer
|
|
pool for their locks.
|
|
@return TRUE if less than 25 % of buffer pool left */
|
|
UNIV_INTERN
|
|
ibool
|
|
buf_LRU_buf_pool_running_out(void);
|
|
/*==============================*/
|
|
|
|
/*#######################################################################
|
|
These are low-level functions
|
|
#########################################################################*/
|
|
|
|
/** Minimum LRU list length for which the LRU_old pointer is defined */
|
|
#define BUF_LRU_OLD_MIN_LEN 512 /* 8 megabytes of 16k pages */
|
|
|
|
/** Maximum LRU list search length in buf_flush_LRU_recommendation() */
|
|
#define BUF_LRU_FREE_SEARCH_LEN (5 + 2 * BUF_READ_AHEAD_AREA)
|
|
|
|
/******************************************************************//**
|
|
Invalidates all pages belonging to a given tablespace when we are deleting
|
|
the data file(s) of that tablespace. A PROBLEM: if readahead is being started,
|
|
what guarantees that it will not try to read in pages after this operation has
|
|
completed? */
|
|
UNIV_INTERN
|
|
void
|
|
buf_LRU_invalidate_tablespace(
|
|
/*==========================*/
|
|
ulint id); /*!< in: space id */
|
|
/********************************************************************//**
|
|
Insert a compressed block into buf_pool->zip_clean in the LRU order. */
|
|
UNIV_INTERN
|
|
void
|
|
buf_LRU_insert_zip_clean(
|
|
/*=====================*/
|
|
buf_page_t* bpage); /*!< in: pointer to the block in question */
|
|
|
|
/******************************************************************//**
|
|
Try to free a block. If bpage is a descriptor of a compressed-only
|
|
page, the descriptor object will be freed as well.
|
|
|
|
NOTE: If this function returns BUF_LRU_FREED, it will not temporarily
|
|
release buf_pool_mutex. Furthermore, the page frame will no longer be
|
|
accessible via bpage.
|
|
|
|
The caller must hold buf_pool_mutex and buf_page_get_mutex(bpage) and
|
|
release these two mutexes after the call. No other
|
|
buf_page_get_mutex() may be held when calling this function.
|
|
@return BUF_LRU_FREED if freed, BUF_LRU_CANNOT_RELOCATE or
|
|
BUF_LRU_NOT_FREED otherwise. */
|
|
UNIV_INTERN
|
|
enum buf_lru_free_block_status
|
|
buf_LRU_free_block(
|
|
/*===============*/
|
|
buf_page_t* bpage, /*!< in: block to be freed */
|
|
ibool zip, /*!< in: TRUE if should remove also the
|
|
compressed page of an uncompressed page */
|
|
ibool* buf_pool_mutex_released);
|
|
/*!< in: pointer to a variable that will
|
|
be assigned TRUE if buf_pool_mutex
|
|
was temporarily released, or NULL */
|
|
/******************************************************************//**
|
|
Try to free a replaceable block.
|
|
@return TRUE if found and freed */
|
|
UNIV_INTERN
|
|
ibool
|
|
buf_LRU_search_and_free_block(
|
|
/*==========================*/
|
|
ulint n_iterations); /*!< in: how many times this has been called
|
|
repeatedly without result: a high value means
|
|
that we should search farther; if
|
|
n_iterations < 10, then we search
|
|
n_iterations / 10 * buf_pool->curr_size
|
|
pages from the end of the LRU list; if
|
|
n_iterations < 5, then we will also search
|
|
n_iterations / 5 of the unzip_LRU list. */
|
|
/******************************************************************//**
|
|
Returns a free block from the buf_pool. The block is taken off the
|
|
free list. If it is empty, returns NULL.
|
|
@return a free control block, or NULL if the buf_block->free list is empty */
|
|
UNIV_INTERN
|
|
buf_block_t*
|
|
buf_LRU_get_free_only(void);
|
|
/*=======================*/
|
|
/******************************************************************//**
|
|
Returns a free block from the buf_pool. The block is taken off the
|
|
free list. If it is empty, blocks are moved from the end of the
|
|
LRU list to the free list.
|
|
@return the free control block, in state BUF_BLOCK_READY_FOR_USE */
|
|
UNIV_INTERN
|
|
buf_block_t*
|
|
buf_LRU_get_free_block(
|
|
/*===================*/
|
|
ulint zip_size); /*!< in: compressed page size in bytes,
|
|
or 0 if uncompressed tablespace */
|
|
|
|
/******************************************************************//**
|
|
Puts a block back to the free list. */
|
|
UNIV_INTERN
|
|
void
|
|
buf_LRU_block_free_non_file_page(
|
|
/*=============================*/
|
|
buf_block_t* block); /*!< in: block, must not contain a file page */
|
|
/******************************************************************//**
|
|
Adds a block to the LRU list. */
|
|
UNIV_INTERN
|
|
void
|
|
buf_LRU_add_block(
|
|
/*==============*/
|
|
buf_page_t* bpage, /*!< in: control block */
|
|
ibool old); /*!< in: TRUE if should be put to the old
|
|
blocks in the LRU list, else put to the
|
|
start; if the LRU list is very short, added to
|
|
the start regardless of this parameter */
|
|
/******************************************************************//**
|
|
Adds a block to the LRU list of decompressed zip pages. */
|
|
UNIV_INTERN
|
|
void
|
|
buf_unzip_LRU_add_block(
|
|
/*====================*/
|
|
buf_block_t* block, /*!< in: control block */
|
|
ibool old); /*!< in: TRUE if should be put to the end
|
|
of the list, else put to the start */
|
|
/******************************************************************//**
|
|
Moves a block to the start of the LRU list. */
|
|
UNIV_INTERN
|
|
void
|
|
buf_LRU_make_block_young(
|
|
/*=====================*/
|
|
buf_page_t* bpage); /*!< in: control block */
|
|
/******************************************************************//**
|
|
Moves a block to the end of the LRU list. */
|
|
UNIV_INTERN
|
|
void
|
|
buf_LRU_make_block_old(
|
|
/*===================*/
|
|
buf_page_t* bpage); /*!< in: control block */
|
|
/**********************************************************************//**
|
|
Updates buf_LRU_old_ratio.
|
|
@return updated old_pct */
|
|
UNIV_INTERN
|
|
uint
|
|
buf_LRU_old_ratio_update(
|
|
/*=====================*/
|
|
uint old_pct,/*!< in: Reserve this percentage of
|
|
the buffer pool for "old" blocks. */
|
|
ibool adjust);/*!< in: TRUE=adjust the LRU list;
|
|
FALSE=just assign buf_LRU_old_ratio
|
|
during the initialization of InnoDB */
|
|
/********************************************************************//**
|
|
Update the historical stats that we are collecting for LRU eviction
|
|
policy at the end of each interval. */
|
|
UNIV_INTERN
|
|
void
|
|
buf_LRU_stat_update(void);
|
|
/*=====================*/
|
|
|
|
#if defined UNIV_DEBUG || defined UNIV_BUF_DEBUG
|
|
/**********************************************************************//**
|
|
Validates the LRU list.
|
|
@return TRUE */
|
|
UNIV_INTERN
|
|
ibool
|
|
buf_LRU_validate(void);
|
|
/*==================*/
|
|
#endif /* UNIV_DEBUG || UNIV_BUF_DEBUG */
|
|
#if defined UNIV_DEBUG_PRINT || defined UNIV_DEBUG || defined UNIV_BUF_DEBUG
|
|
/**********************************************************************//**
|
|
Prints the LRU list. */
|
|
UNIV_INTERN
|
|
void
|
|
buf_LRU_print(void);
|
|
/*===============*/
|
|
#endif /* UNIV_DEBUG_PRINT || UNIV_DEBUG || UNIV_BUF_DEBUG */
|
|
|
|
/** @name Heuristics for detecting index scan @{ */
|
|
/** Reserve this much/BUF_LRU_OLD_RATIO_DIV of the buffer pool for
|
|
"old" blocks. Protected by buf_pool_mutex. */
|
|
extern uint buf_LRU_old_ratio;
|
|
/** The denominator of buf_LRU_old_ratio. */
|
|
#define BUF_LRU_OLD_RATIO_DIV 1024
|
|
/** Maximum value of buf_LRU_old_ratio.
|
|
@see buf_LRU_old_adjust_len
|
|
@see buf_LRU_old_ratio_update */
|
|
#define BUF_LRU_OLD_RATIO_MAX BUF_LRU_OLD_RATIO_DIV
|
|
/** Minimum value of buf_LRU_old_ratio.
|
|
@see buf_LRU_old_adjust_len
|
|
@see buf_LRU_old_ratio_update
|
|
The minimum must exceed
|
|
(BUF_LRU_OLD_TOLERANCE + 5) * BUF_LRU_OLD_RATIO_DIV / BUF_LRU_OLD_MIN_LEN. */
|
|
#define BUF_LRU_OLD_RATIO_MIN 51
|
|
|
|
#if BUF_LRU_OLD_RATIO_MIN >= BUF_LRU_OLD_RATIO_MAX
|
|
# error "BUF_LRU_OLD_RATIO_MIN >= BUF_LRU_OLD_RATIO_MAX"
|
|
#endif
|
|
#if BUF_LRU_OLD_RATIO_MAX > BUF_LRU_OLD_RATIO_DIV
|
|
# error "BUF_LRU_OLD_RATIO_MAX > BUF_LRU_OLD_RATIO_DIV"
|
|
#endif
|
|
|
|
/** Move blocks to "new" LRU list only if the first access was at
|
|
least this many milliseconds ago. Not protected by any mutex or latch. */
|
|
extern uint buf_LRU_old_threshold_ms;
|
|
/* @} */
|
|
|
|
/** @brief Statistics for selecting the LRU list for eviction.
|
|
|
|
These statistics are not 'of' LRU but 'for' LRU. We keep count of I/O
|
|
and page_zip_decompress() operations. Based on the statistics we decide
|
|
if we want to evict from buf_pool->unzip_LRU or buf_pool->LRU. */
|
|
struct buf_LRU_stat_struct
|
|
{
|
|
ulint io; /**< Counter of buffer pool I/O operations. */
|
|
ulint unzip; /**< Counter of page_zip_decompress operations. */
|
|
};
|
|
|
|
/** Statistics for selecting the LRU list for eviction. */
|
|
typedef struct buf_LRU_stat_struct buf_LRU_stat_t;
|
|
|
|
/** Current operation counters. Not protected by any mutex.
|
|
Cleared by buf_LRU_stat_update(). */
|
|
extern buf_LRU_stat_t buf_LRU_stat_cur;
|
|
|
|
/** Running sum of past values of buf_LRU_stat_cur.
|
|
Updated by buf_LRU_stat_update(). Protected by buf_pool_mutex. */
|
|
extern buf_LRU_stat_t buf_LRU_stat_sum;
|
|
|
|
/********************************************************************//**
|
|
Increments the I/O counter in buf_LRU_stat_cur. */
|
|
#define buf_LRU_stat_inc_io() buf_LRU_stat_cur.io++
|
|
/********************************************************************//**
|
|
Increments the page_zip_decompress() counter in buf_LRU_stat_cur. */
|
|
#define buf_LRU_stat_inc_unzip() buf_LRU_stat_cur.unzip++
|
|
|
|
#ifndef UNIV_NONINL
|
|
#include "buf0lru.ic"
|
|
#endif
|
|
|
|
#endif
|