mariadb/storage/innobase/include/buf0buddy.h
Marko Mäkelä cc89e5f94f MDEV-29445: Reimplement SET GLOBAL innodb_buffer_pool_size
We deprecate and ignore the parameter innodb_buffer_pool_chunk_size
and allow the buffer pool size to be changed in arbitrary 1-megabyte
increments, all the way up to innodb_buffer_pool_size_max,
which must be specified at startup.

If innodb_buffer_pool_size_max is not specified, it will default to
twice the specified innodb_buffer_pool_size.

The buffer pool will be mapped in a contiguous memory area that
will be aligned and partitioned into extents of 8 MiB on 64-bit systems
and 2 MiB on 32-bit systems.

Within an extent, the first few innodb_page_size blocks contain
buf_block_t objects that will cover the page frames in the rest
of the extent. In this way, there is a trivial mapping between
page frames and block descriptors and we do not need any
lookup tables like buf_pool.zip_hash or buf_pool_t::chunk_t::map.
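
The mapping described above can be sketched with plain address arithmetic.
This is only an illustration, not the actual MariaDB code: kDescSize is a
made-up descriptor size, and header_pages(), descriptor_of() and frame_of()
are hypothetical helper names. It assumes every extent is aligned to its own
size, as the text states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// All constants below are illustrative assumptions, not MariaDB's values.
constexpr std::size_t kExtentSize = std::size_t{8} << 20;  // 8 MiB (64-bit case)
constexpr std::size_t kPageSize = std::size_t{16} << 10;   // innodb_page_size=16k
constexpr std::size_t kDescSize = 128;  // hypothetical sizeof(buf_block_t)
constexpr std::size_t kPagesPerExtent = kExtentSize / kPageSize;

// Smallest number of leading pages whose descriptors cover the other frames.
constexpr std::size_t header_pages() {
  std::size_t h = 1;
  while (h * (kPageSize / kDescSize) < kPagesPerExtent - h) h++;
  return h;
}

// Address of the first byte of the extent that contains ptr.
inline std::uintptr_t extent_of(std::uintptr_t ptr) {
  return ptr & ~(static_cast<std::uintptr_t>(kExtentSize) - 1);
}

// Map any address inside a page frame to the address of its descriptor.
inline std::uintptr_t descriptor_of(std::uintptr_t frame_ptr) {
  const std::uintptr_t extent = extent_of(frame_ptr);
  const std::size_t page_no = (frame_ptr - extent) / kPageSize;
  assert(page_no >= header_pages());  // header pages hold descriptors only
  return extent + (page_no - header_pages()) * kDescSize;
}

// Inverse mapping: descriptor address to the start of its page frame.
inline std::uintptr_t frame_of(std::uintptr_t desc_ptr) {
  const std::uintptr_t extent = extent_of(desc_ptr);
  const std::size_t n = (desc_ptr - extent) / kDescSize;
  assert(n < kPagesPerExtent - header_pages());
  return extent + (header_pages() + n) * kPageSize;
}
```

Both directions need only a mask, a division and a multiplication, which is
why no hash table or per-chunk map is required in this scheme.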

We will always allocate the same number of block descriptors for
an extent, even if we do not need all the buf_block_t in the last
extent in case the innodb_buffer_pool_size is not an integer multiple
of the extent size.

The minimum innodb_buffer_pool_size is 256*5/4 pages. At the default
innodb_page_size=16k this corresponds to 5 MiB. However, now that the
innodb_buffer_pool_size includes the memory allocated for the block
descriptors, the minimum would be innodb_buffer_pool_size=6m.
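
The arithmetic above can be checked with a small sketch. It assumes that
the minimum is rounded up to the 1 MiB resize granularity, and it takes the
descriptor overhead as a parameter (header_pages) because the real number of
descriptor pages per extent is not given here. The helper names are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMiB = std::size_t{1} << 20;

// Minimum number of page frames stated in the text: 256*5/4 = 320.
constexpr std::size_t min_frames() { return 256 * 5 / 4; }

// Round a byte count up to the next whole MiB.
constexpr std::size_t round_up_mib(std::size_t bytes) {
  return (bytes + kMiB - 1) / kMiB * kMiB;
}

// header_pages is a hypothetical descriptor overhead, counted in pages.
inline std::size_t min_pool_bytes(std::size_t page_size,
                                  std::size_t header_pages) {
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  return round_up_mib((min_frames() + header_pages) * page_size);
}
```

With 16 KiB pages, 320 frames are exactly 5 MiB, so any nonzero descriptor
overhead of less than 1 MiB pushes the rounded minimum to 6 MiB.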

Innodb_buffer_pool_resize_status: Remove. We will execute
buf_pool_t::resize() synchronously in the thread that is executing
SET GLOBAL innodb_buffer_pool_size. That operation will run until
it completes, or until a KILL statement is executed, the client
is disconnected, the buf_flush_page_cleaner() thread notices that
we are running out of memory, or the server is shut down.

my_large_virtual_alloc(): A new function, similar to my_large_malloc().
FIXME: On Microsoft Windows, let the caller know if large page allocation
was used. In that case, we must disallow buffer pool resizing.

buf_pool_t::create(), buf_pool_t::chunk_t::create(): Only initialize
the first page descriptor of each chunk.

buf_pool_t::lazy_allocate(): Lazily initialize a previously allocated
page descriptor and increase buf_pool.n_blocks, which must be below
buf_pool.n_blocks_alloc.

buf_pool_t::allocate(): Renamed from buf_LRU_get_free_only().

buf_pool_t::LRU_warned: Changed to Atomic_relaxed<bool>,
only to be modified by the buf_flush_page_cleaner() thread.

buf_pool_t::LRU_shrink(): Check if buffer pool shrinking needs
to process a buffer page.

buf_pool_t::resize(): Always zero out b->page.zip.data.
Failure to do so would cause crashes or corruption in
the test innodb.innodb_buffer_pool_resize due to
duplicated allocation in the buddy system.
Before starting to shrink the buffer pool, run one batch of
buf_flush_page_cleaner() in order to prevent LRU_warn().
Abort shrinking if the buf_flush_page_cleaner() has LRU_warned.

buf_pool_t::first_to_withdraw: The first block descriptor that is
out of the bounds of the shrunk buffer pool.

buf_pool_t::withdrawn: The list of withdrawn blocks.
If buf_pool_t::resize() is aborted, we must be able to resurrect
the withdrawn blocks in the free list.

buf_pool_t::contains_zip(): Added a parameter for the
number of least significant pointer bits to disregard,
so that we can find any pointers to within a block
that is supposed to be free.
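
One way to picture the extra parameter is a pointer comparison that ignores
the low bits. The sketch below is not the buf_pool_t::contains_zip()
implementation; same_block() and contains_ptr() are hypothetical stand-ins,
and the list of in-use pointers replaces whatever the real function scans.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// True if a and b are equal once the `shift` least significant bits are
// disregarded, that is, both point into the same aligned 2^shift-byte block.
// shift must be smaller than the width of a pointer in bits.
inline bool same_block(const void *a, const void *b, unsigned shift) {
  const auto x = reinterpret_cast<std::uintptr_t>(a);
  const auto y = reinterpret_cast<std::uintptr_t>(b);
  return ((x ^ y) >> shift) == 0;
}

// Check whether any in-use pointer falls within the block around `block`.
inline bool contains_ptr(const std::vector<const void*> &in_use,
                         const void *block, unsigned shift) {
  for (const void *p : in_use)
    if (p && same_block(p, block, shift))
      return true;
  return false;
}
```

For example, with shift=12 a pointer to any 1 KiB piece of a 4 KiB-aligned
block matches the block start, while shift=10 only matches pointers into the
same 1 KiB piece.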

buf_pool_t::get_info(): Replaces buf_stats_get_pool_info().

innodb_init_param(): Refactored. We must first compute
srv_page_size_shift and then determine the valid bounds of
innodb_buffer_pool_size.

buf_buddy_shrink(): Replaces buf_buddy_realloc().
Part of the work is deferred to buf_buddy_condense_free(),
which is executed when we are not holding any
buf_pool.page_hash latch.

buf_buddy_condense_free(): Do not relocate blocks.

buf_buddy_free_low(): Do not care about buffer pool shrinking.
This will be handled by buf_buddy_shrink() and
buf_buddy_condense_free().

buf_buddy_alloc_zip(): Assert !buf_pool.contains_zip()
when we are allocating from the binary buddy system.
Previously we were asserting this on multiple recursion levels.

buf_buddy_block_free(), buf_buddy_free_low():
Assert !buf_pool.contains_zip().

buf_buddy_alloc_from(): Remove the redundant parameter j.

buf_flush_LRU_list_batch(): Add the parameter shrinking.
If we are shrinking, invoke buf_pool_t::LRU_shrink() to see
if we must keep going.

buf_do_LRU_batch(): Skip buf_free_from_unzip_LRU_list_batch()
if we are shrinking the buffer pool. In that case, we want
to minimize the page relocations and just finish as quickly
as possible.

trx_purge_attach_undo_recs(): Limit purge_sys.n_pages_handled()
in every iteration, in case the buffer pool is being shrunk
in the middle of a purge batch.
2025-02-05 16:12:29 +02:00


/*****************************************************************************
Copyright (c) 2006, 2016, Oracle and/or its affiliates. All Rights Reserved.
Copyright (c) 2018, 2020, MariaDB Corporation.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1335 USA
*****************************************************************************/
/**************************************************//**
@file include/buf0buddy.h
Binary buddy allocator for compressed pages
Created December 2006 by Marko Makela
*******************************************************/
#pragma once
#include "buf0types.h"
/**
@param[in] size block size in bytes
@return index of buf_pool.zip_free[], or BUF_BUDDY_SIZES */
inline ulint buf_buddy_get_slot(ulint size) noexcept
{
  ulint i;
  ulint s;
  ut_ad(ut_is_2pow(size));
  ut_ad(size >= UNIV_ZIP_SIZE_MIN);
  ut_ad(size <= srv_page_size);
  for (i = 0, s = BUF_BUDDY_LOW; s < size; i++, s <<= 1) {
  }
  ut_ad(i <= BUF_BUDDY_SIZES);
  return i;
}
/** Allocate a ROW_FORMAT=COMPRESSED block.
@param i index of buf_pool.zip_free[] or BUF_BUDDY_SIZES
@param lru assigned to true if buf_pool.mutex was temporarily released
@return allocated block, never NULL */
byte *buf_buddy_alloc_low(ulint i, bool *lru) noexcept MY_ATTRIBUTE((malloc));
/** Allocate a ROW_FORMAT=COMPRESSED block.
@param size compressed page size in bytes
@param lru assigned to true if buf_pool.mutex was temporarily released
@return allocated block, never NULL */
inline byte *buf_buddy_alloc(ulint size, bool *lru= nullptr) noexcept
{
  return buf_buddy_alloc_low(buf_buddy_get_slot(size), lru);
}
/** Deallocate a block.
@param[in] buf block to be freed, must not be pointed to
by the buffer pool
@param[in] i index of buf_pool.zip_free[], or BUF_BUDDY_SIZES */
void buf_buddy_free_low(void* buf, ulint i) noexcept;
/** Deallocate a block.
@param[in] buf block to be freed, must not be pointed to
by the buffer pool
@param[in] size block size in bytes */
inline void buf_buddy_free(void* buf, ulint size) noexcept
{
  buf_buddy_free_low(buf, buf_buddy_get_slot(size));
}
/** Reallocate a ROW_FORMAT=COMPRESSED page frame during buf_pool_t::resize().
@param bpage page descriptor covering a ROW_FORMAT=COMPRESSED page
@param block uncompressed block for storage
@return block
@retval nullptr if the block was consumed */
ATTRIBUTE_COLD MY_ATTRIBUTE((nonnull, warn_unused_result))
buf_block_t *buf_buddy_shrink(buf_page_t *bpage, buf_block_t *block) noexcept;
/** Combine all pairs of free buddies.
@param size the target innodb_buffer_pool_size */
ATTRIBUTE_COLD void buf_buddy_condense_free(size_t size) noexcept;
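
A standalone sketch of the two ideas this header relies on: the mapping from
a power-of-two block size to a free-list slot (as in buf_buddy_get_slot()),
and the location of a block's buddy. The constants kBuddyLow and kPageSize
are assumptions standing in for BUF_BUDDY_LOW and srv_page_size, and
buddy_slot() and buddy_of() are hypothetical names. The XOR form of the buddy
computation is the textbook one and is not taken from this file.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kBuddyLow = 1024;   // assumed smallest block, 1 KiB
constexpr std::size_t kPageSize = 16384;  // assumed innodb_page_size=16k

// Slot index for a power-of-two block size: 0 for kBuddyLow, 1 for twice
// that, and so on. A whole page maps to the last index, which plays the
// role of BUF_BUDDY_SIZES in the header above.
inline std::size_t buddy_slot(std::size_t size) {
  assert(size >= kBuddyLow && size <= kPageSize);
  assert((size & (size - 1)) == 0);  // must be a power of two
  std::size_t i = 0;
  for (std::size_t s = kBuddyLow; s < size; s <<= 1)
    i++;
  return i;
}

// Offset of the buddy of the block at `offset`, both relative to the start
// of a page-aligned frame. A block and its buddy differ in exactly one bit.
inline std::size_t buddy_of(std::size_t offset, std::size_t size) {
  assert(offset % size == 0 && offset + size <= kPageSize);
  return offset ^ size;
}
```

Two free blocks of the same size can be merged into one block of twice the
size only when each is the buddy_of() the other, which is the condition a
pass that combines free buddies has to test.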