compressed pages from the buffer pool.
Makefile.am: Add buf0buddy.h, buf0buddy.ic.
buf/Makefile.am: Add buf0buddy.c.
Introduce the constants BUF_BUDDY_LOW and BUF_BUDDY_SIZES.
buf_pool_t: Add zip_mutex and the lists zip_clean and zip_free[].
buf_page_get_mutex(): Return &buf_pool->zip_mutex instead of NULL.
buf_buddy_get_offset(), buf_buddy_get(), buf_buddy_get_slot(),
buf_buddy_alloc_free(), buf_buddy_alloc_free_low(): New functions.
(state == BUF_BLOCK_ZIP_PAGE). Make use of buf_page_in_file()
and buf_page_get_mutex().
buf_block_get_newest_modification(): Rename to
buf_page_get_newest_modification().
Add m_nonempty for facilitating the test in page_zip_alloc(). This
reduces the combined size of the bit-fields to 32 bits. Thus,
sizeof(page_zip_des_t) == 2 machine words on 32-bit and wider systems.
decompression from page_zip_des_t to buf_page_t, because the fields
needed in compression are modified without holding the block mutex.
All writes to bit-fields sharing a machine word must be protected
by the same mutex or rw-lock.
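A minimal sketch of the layout constraint described above. The struct name and every field except m_nonempty are hypothetical; the widths are chosen only so that the bit-fields add up to 32 bits. Because the compiler updates a bit-field with a read-modify-write of the whole containing word, two writers holding different mutexes could overwrite each other's updates, which is why all fields sharing a word need the same lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a compressed page descriptor: one pointer
plus bit-fields whose widths add up to 32 bits (16 + 1 + 12 + 3).
Only m_nonempty is named in the log above; the rest are assumptions. */
typedef struct zip_des_sketch {
	unsigned char*	data;		/* compressed page frame */
	unsigned	m_end:16;	/* assumed: end of the modification log */
	unsigned	m_nonempty:1;	/* nonzero if the log is nonempty */
	unsigned	n_blobs:12;	/* assumed: number of BLOB pointers */
	unsigned	ssize:3;	/* assumed: encoded compressed page size */
} zip_des_sketch;

/* Total width of the bit-fields above. */
enum { ZIP_DES_SKETCH_BITS = 16 + 1 + 12 + 3 };

/* Reset all fields of a descriptor. */
static void
zip_des_sketch_init(zip_des_sketch* d)
{
	assert(d != NULL);
	d->data = NULL;
	d->m_end = 0;
	d->m_nonempty = 0;
	d->n_blobs = 0;
	d->ssize = 0;
}

/* Return nonzero if the descriptor occupies exactly two pointer-sized
words. Bit-field layout is implementation-defined, so this is a check
for the current compiler, not a guarantee. */
static int
zip_des_sketch_is_two_words(void)
{
	return(sizeof(zip_des_sketch) == 2 * sizeof(void*));
}
```

On common 32-bit and 64-bit ABIs the check is expected to hold, but it should be verified per compiler.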
buf_block_t. Move the fields "hash" and "file_page_was_freed" from
buf_block_t to buf_page_t.
buf_page_in_file(): New function, for checking block state in assertions.
storage size from 16 to 3 bits.
page_zip_get_size(), page_zip_set_size(): New functions.
Replace direct references to page_zip_des_t:size with calls to
buf_block_get_zip_size(), page_zip_get_size(), and page_zip_set_size().
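One way a page size can be stored in 3 bits is as a shift count. The sketch below assumes such an encoding (0 means "not compressed", otherwise the size is 512 << ssize); the names and the encoding are illustrative and are not taken from the actual page_zip_get_size() and page_zip_set_size().

```c
#include <assert.h>
#include <stddef.h>

/* Assumed encoding: 0 means "not compressed"; otherwise the size in
bytes is 512 << ssize, so the values 1..5 cover 1024..16384 bytes. */
#define SK_ZIP_SSIZE_BITS	3

typedef struct sk_zip_size {
	unsigned	ssize:SK_ZIP_SSIZE_BITS;
} sk_zip_size;

/* Decode the stored shift count into a size in bytes. */
static unsigned long
sk_zip_get_size(const sk_zip_size* z)
{
	assert(z != NULL);
	return(z->ssize ? 512UL << z->ssize : 0);
}

/* Encode a size in bytes (0 or a power of two from 1024 upwards). */
static void
sk_zip_set_size(sk_zip_size* z, unsigned long size)
{
	unsigned	ssize = 0;

	assert(z != NULL);

	if (size) {
		/* Find the shift count; stop at the largest 3-bit value. */
		for (ssize = 1;
		     ssize < 7 && (512UL << ssize) < size;
		     ssize++) {
		}

		/* The size must be exactly representable. */
		assert((512UL << ssize) == size);
	}

	z->ssize = ssize;
}
```

Hiding the field behind accessors lets the stored representation shrink without touching the callers.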
and uncompressed buffer pool pages.
buf_block_t: Replace page_zip, space, and offset with buf_page_t page.
Replace some integers with bit-fields.
enum buf_block_state: Rename to buf_page_state. Add BUF_BLOCK_ZIP_PAGE.
page_zip_des_t: Add the field "state". Make the integer fields bit-fields.
page_zip_copy(): Document which fields are copied.
and block->space with buf_block_get_state(block), buf_block_get_page_no(block),
and buf_block_get_space(block).
enum buf_block_state: Replaces the #define'd buf_block_t.state values.
buf_block_get_state(): New function.
buf_block_get_frame(): Add __attribute__((const)).
hash index, because it might occupy the chunk we would like to free.
TODO: In btr_search_check_free_space_in_heap(), release the block if
btr_search_latch is not immediately available.
buf_pool_shrink(): Split from buf_pool_resize().
btr_search_disabled: New variable, similar to srv_use_adaptive_hash_indexes,
which was removed earlier.
btr_search_disable(): New function: disable and purge the adaptive hash index.
btr_search_enable(): New function: enable the adaptive hash index.
ha_clear(): New function: Empty a hash table and free the memory heaps.
assume that non-file pages are free. After trying to free or flush
file pages, do not proceed to buf_chunk_free(), because the calls will
temporarily release buf_pool->mutex. Do not flush if there are non-free
blocks, because it would not achieve anything.
enclose all related debug code in #ifdef UNIV_DEBUG_FILE_ACCESSES.
This should have no effect on the behaviour, as the symbol is
not defined by default. It only reduces the size of buf_block_t
and removes some assignments and debug functions.
buf_LRU_block_free_non_file_page(): Deallocate block->page_zip.data
to avoid ut_a(!block->page_zip.data) in buf_chunk_free().
buf_chunk_free(): Add the assertion ut_a(!block->in_LRU_list).
buf_pool_resize(): When shrinking the buffer pool and there are
non-free blocks in the candidate chunk, free the clean blocks,
move the dirty blocks to the end of the LRU list, and request a flush.
Proceed if the chunk becomes free, and retry otherwise.
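The shrink-and-retry logic above can be sketched with toy data structures. Everything here (types, names, the flush stand-in) is hypothetical and ignores mutexes and the LRU list; it only illustrates the "free clean blocks, request a flush for dirty ones, retry" loop.

```c
#include <assert.h>
#include <stddef.h>

/* Toy block states; not the real buffer pool states. */
typedef enum { SK_FREE, SK_CLEAN, SK_DIRTY } sk_state;

typedef struct sk_block {
	sk_state	state;
	int		flush_requested;
} sk_block;

/* One pass over a candidate chunk. Clean blocks are freed at once;
dirty blocks are only marked for flushing. Returns the number of
blocks that are still not free, so 0 means the chunk can be released. */
static size_t
sk_chunk_try_free(sk_block* blocks, size_t n)
{
	size_t	i;
	size_t	nonfree = 0;

	assert(blocks != NULL);

	for (i = 0; i < n; i++) {
		switch (blocks[i].state) {
		case SK_FREE:
			break;
		case SK_CLEAN:
			blocks[i].state = SK_FREE;
			break;
		case SK_DIRTY:
			blocks[i].flush_requested = 1;
			nonfree++;
			break;
		}
	}

	return(nonfree);
}

/* Stand-in for the flush completing: every dirty block for which a
flush was requested becomes clean. */
static void
sk_flush_complete(sk_block* blocks, size_t n)
{
	size_t	i;

	assert(blocks != NULL);

	for (i = 0; i < n; i++) {
		if (blocks[i].state == SK_DIRTY
		    && blocks[i].flush_requested) {
			blocks[i].state = SK_CLEAN;
			blocks[i].flush_requested = 0;
		}
	}
}
```

A caller would repeat sk_chunk_try_free() after each flush until it returns 0, and only then release the chunk.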
mysql.patch: Patch to change or add variables to MySQL.
innodb.patch: Patch to make the master thread poll requests to resize
the buffer pool.
Replace srv_pool_size and innobase_buffer_pool_size
with srv_buf_pool_size, srv_buf_pool_old_size,
and srv_buf_pool_curr_size.
Add buf_chunk_t, a collection of buf_block_t.
buf_LRU_block_remove_hashed_page(): Overwrite (space_id,page_no)
when freeing a buffer block. This will help catch non-file
pages being passed to buf_block_align().
On POSIX, use mmap() and munmap(). On Windows, use VirtualAlloc()
and VirtualFree(). Only on Netware, use ut_malloc_low() and ut_free().
The lower-level functions on POSIX and Windows allow InnoDB to return
memory to the operating system when the buffer pool is shrunk.
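A minimal POSIX-only sketch of allocating and releasing a chunk with mmap() and munmap(). The wrapper names are hypothetical, and the Windows and Netware branches mentioned above are omitted.

```c
#define _DEFAULT_SOURCE	/* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Allocate a zero-filled, page-aligned region directly from the
operating system. Returns NULL on failure. */
static void*
sk_chunk_alloc(size_t size)
{
	void*	ptr;

	assert(size > 0);

	ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return(ptr == MAP_FAILED ? NULL : ptr);
}

/* Return the region to the operating system. The size must match
the size passed to sk_chunk_alloc(). Returns 0 on success. */
static int
sk_chunk_free(void* ptr, size_t size)
{
	assert(ptr != NULL);

	return(munmap(ptr, size));
}
```

Unlike free(), which may keep the memory in the process heap, munmap() releases the mapping at once, which is what makes shrinking the buffer pool visible to the operating system.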
buf_pool_t: Remove n_frames, max_size, and blocks_of_frames.
The current buffer pool size is in curr_size.
buf_pool_init(): Remove parameter max_size.
buf_pool_get_max_size(), buf_pool_is_block(): Remove.
buf_block_align(): Do not assume that the buffer pool is allocated
in one chunk. Replace dependency on buf_pool->blocks_of_frames
with a call to buf_page_hash_get().
btr_pcur_restore_position(): Add const qualifiers.
buf_LRU_block_remove_hashed_page(): Reduce the number
of buf_page_hash_get() calls and add a UNIV_UNLIKELY hint
to an assertion-like test.
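A branch hint of this kind can be sketched as follows. SK_UNLIKELY is a hypothetical stand-in; it assumes the hint maps to the GCC builtin __builtin_expect() and falls back to the plain condition on other compilers.

```c
#include <assert.h>
#include <stddef.h>

/* Tell the compiler that the condition is expected to be false, so
that the error path is moved out of the hot code path. */
#if defined(__GNUC__)
# define SK_UNLIKELY(cond)	__builtin_expect(!!(cond), 0)
#else
# define SK_UNLIKELY(cond)	(cond)
#endif

/* Assertion-like test: return 0 if the hash lookup returned the
expected block, -1 otherwise. */
static int
sk_check_hashed(const void* expected, const void* found)
{
	assert(expected != NULL);

	if (SK_UNLIKELY(expected != found)) {
		/* Cold path: a real caller would print diagnostics. */
		return(-1);
	}

	return(0);
}
```

The hint changes only code layout, not behaviour, so it is safe on tests that are almost never true.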
buf0lru.c: Always #include "srv0srv.h".
buf_block_get_lock_mutex(), buf_frame_get_lock_mutex(): Remove.
mtr0log.ic: Remove #include "page0page.h" and replace the page_
functions with lower-level ones to break an #include cycle.
dict0dict.ic: Remove unnecessary #include "trx0undo.h" and "trx0sys.h"
that would create an #include cycle.
Replace buf_frame_t* guess with buf_block_t* guess in order to avoid
a buf_block_align() call.
trx_undo_t: Replace page_t* guess_page with buf_block_t* guess_block.
btr_search_t: Replace page_t* root_guess with buf_block_t* root_guess.
passed as TRUE.
Enclose hash_table_t::adaptive and buf_block_t::n_pointers in
#ifdef UNIV_DEBUG.
btr_search_drop_page_hash_index(): Enclose the corruption check
(which depends on buf_block_t::n_pointers) in #ifdef UNIV_DEBUG.
the symbol UNIV_DEBUG_PRINT, which was introduced in r729.
buf_LRU_print(), buf_print(): Replace #ifdef UNIV_DEBUG_PRINT
with #if defined UNIV_DEBUG || defined UNIV_DEBUG_PRINT.
ibuf_reset_free_bits(): Remove, as there already is a similar function
ibuf_reset_free_bits_with_type().
ibuf_reset_free_bits_with_type(), ibuf_set_free_bits(),
ibuf_update_free_bits_if_full(), btr_leaf_page_release(),
buf_page_make_young(): Replace page_t with buf_block_t.
btr_compress(): Replace btr_page_get() with btr_block_get().
with page_get_page_no() and page_get_space_id(). We want to avoid
buf_block_align() calls, and the page_no and space_id are now stamped
on the pages early on.
buf_flush_init_for_writing(): Remove parameters space, page_no.
fsp_init_file_page_low(): Write space_id and page_no to the page.
fil_create_new_single_table_tablespace(): Write space_id to the page.
dict_load_foreigns(): Enclose in #ifndef UNIV_HOTBACKUP.
fil_extend_tablespaces_to_stored_len(): Pass zip_size to fil_read().
buf_page_init_for_backup_restore(): Add parameter zip_size.
Enclose the declaration in buf0buf.h in #ifdef UNIV_HOTBACKUP.
recv_apply_log_recs_for_backup(): Replace the local variable "page"
with the local variable "block". Add local variable zip_size.
with FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID and FIL_PAGE_DATA. The doublewrite
buffer needs to read the space_id in order to determine the type of the page.
Because FIL_PAGE_TYPE could contain garbage in MySQL/InnoDB 5.0 and earlier
versions, we cannot trust fil_page_get_type(). Instead, we have to always
store the space_id at the same location. This modification wastes 12 bytes
per compressed BLOB page (1.2% on 1-kilobyte pages).
for more accurate Valgrind debugging.
univ.i: Introduce UNIV_DEBUG_VALGRIND, UNIV_MEM_VALID, and UNIV_MEM_INVALID.
buf_LRU_block_free_non_file_page(): Invalidate the buffer frame
with UNIV_MEM_INVALID().
buf_LRU_get_free_block(): Declare the buffer frame valid
with UNIV_MEM_VALID().
Other memory is allocated and deallocated via malloc() and free(),
which are already overridden by Valgrind. Without the added
instrumentation, accesses to free pages in the buffer pool cannot
be caught.
The diagnostics could probably be improved further by declaring all
non-latched buffer frames invalid.
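The instrumentation might look roughly like the sketch below. The SK_ names are hypothetical, and the mapping to the Memcheck client requests VALGRIND_MAKE_MEM_DEFINED and VALGRIND_MAKE_MEM_NOACCESS is an assumption, not necessarily what univ.i uses. Without SK_DEBUG_VALGRIND the macros compile to nothing.

```c
#include <assert.h>
#include <stddef.h>

#ifdef SK_DEBUG_VALGRIND
# include <valgrind/memcheck.h>
# define SK_MEM_VALID(addr, size)	VALGRIND_MAKE_MEM_DEFINED(addr, size)
# define SK_MEM_INVALID(addr, size)	VALGRIND_MAKE_MEM_NOACCESS(addr, size)
#else
# define SK_MEM_VALID(addr, size)	do { } while (0)
# define SK_MEM_INVALID(addr, size)	do { } while (0)
#endif

#define SK_FRAME_SIZE	64

/* Toy buffer frame that lives in a pool and is never passed to
free(), so Valgrind cannot know on its own when it is unused. */
typedef struct sk_frame {
	unsigned char	bytes[SK_FRAME_SIZE];
	int		is_free;
} sk_frame;

/* Put the frame on the free list and forbid access to its bytes. */
static void
sk_frame_release(sk_frame* f)
{
	assert(f != NULL && !f->is_free);
	f->is_free = 1;
	SK_MEM_INVALID(f->bytes, sizeof f->bytes);
}

/* Take the frame from the free list and allow access again. */
static void
sk_frame_acquire(sk_frame* f)
{
	assert(f != NULL && f->is_free);
	f->is_free = 0;
	SK_MEM_VALID(f->bytes, sizeof f->bytes);
}
```

With the macros enabled and the program run under Memcheck, any read or write of a released frame would be reported as an invalid access.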
buf_LRU_get_free_block(): When zip_size changes, initialize all fields
of page_zip. This avoids an assertion failure in page_create_zip() when
a block with an originally larger zip_size is reallocated.
fsp_get_space_header(): Assert that the stored space id matches.
xdes_get_state(): Assert that the state is valid.
dict_load_table(): Initialize table->flags with zip_size.
mlog_parse_nbytes(), mlog_parse_string(): Add parameter page_zip and
write the changes also to the compressed page if one is specified.
Assert that these functions are not called on FIL_PAGE_INDEX pages.
buf_page_io_complete(): Replace block->frame with frame where appropriate.
recv_parse_or_apply_log_rec_body(): Add ut_a(!page_zip) where appropriate.
page_parse_delete_rec_list(): Add parameter page_zip.
buf_page_io_complete(): On FIL_PAGE_TYPE_ZBLOB (compressed BLOB pages),
read the space_id from a different location.
page_zip_compress(), page_zip_write_rec(), page_zip_write_blob_ptr():
Replace page_simple_validate_new() with page_validate().
page_zip_clear_rec(): When running out of log space, do not attempt to
recompress the page, because the directory slots might be unbalanced and
the page_validate() assertion in page_zip_compress() would fail.
Instead, clear the BLOB pointers of the deleted record on the
uncompressed page, so that page_zip_validate() will succeed.
page_zip_validate(): Remove the comment about page_zip_clear_rec().
A mismatch always indicates a serious inconsistency.
buf_flush_init_for_writing(): On FIL_PAGE_TYPE_ZBLOB, write to
page_zip->data instead of page.
page_zip_write_rec(), page_zip_write_blob_ptr(), page_zip_write_node_ptr():
Add ut_ad(page_simple_validate_new()).
fil_read(), fil_write(): Make these inlined functions in fil0fil.c.
fil_write_lsn_and_arch_no_to_file(): Remove the parameter space_id and
note that this function is to be called on the system tablespace, which
is uncompressed.
page_zip instead of calling fil_space_get_zip_size(). In
fil_create_new_single_table_tablespace(), the table space has not yet
been created. Handle also FIL_PAGE_TYPE_ALLOCATED.
os_aio_simulated_handle(): Temporarily disable os_file_check_page_trailers(),
which cannot be invoked on compressed pages.
dict_table_add_system_columns(): New function, split from
dict_table_add_to_cache().
mlog_parse_index(): Add system columns to the dummy table and identify
DB_TRX_ID and DB_ROLL_PTR in the dummy index.
buf_LRU_get_free_block(): Note that page_zip->data should be allocated from
an aligned memory pool.
buf_flush_buffered_writes(): Write compressed pages to disk.
buf_flush_post_to_doublewrite_buf(): Copy compressed pages to the
doublewrite buffer. Zero fill any excess space.
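The copy with zero fill can be sketched as below. The function name is hypothetical, and the sketch assumes that each doublewrite slot has the size of one uncompressed page (16384 bytes here).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed size of one doublewrite slot (one uncompressed page). */
#define SK_PAGE_SIZE	16384

/* Copy a compressed page of zip_size bytes into a doublewrite slot
and clear the rest of the slot, so that no stale bytes from an
earlier, larger page are written to disk. */
static void
sk_copy_to_doublewrite(
	unsigned char*		slot,
	const unsigned char*	zip_data,
	size_t			zip_size)
{
	assert(slot != NULL && zip_data != NULL);
	assert(zip_size > 0 && zip_size <= SK_PAGE_SIZE);

	memcpy(slot, zip_data, zip_size);
	memset(slot + zip_size, 0, SK_PAGE_SIZE - zip_size);
}
```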
buf_flush_init_for_writing(): Treat all compressed pages the same.
buf_read_page_low(): Read compressed pages from disk.
buf_page_io_complete(): Process compressed pages.
trx_sys_doublewrite_init_or_restore_page(): Process compressed pages.
mlog_write_initial_log_record_fast(): Enable a debug printout
inside #ifdef UNIV_LOG_DEBUG.
fsp_header_init(), fsp_fill_free_list(): Pass the compressed page size
to buf_page_create().
page_zip_compress_write_log(): Flatten the if-else if-else logic.
page_zip_parse_write_blob_ptr(): Do not test page_zip if page==NULL.
page_zip_parse_write_node_ptr(): Do not test page_zip if page==NULL.
Invoke mlog_close() correctly.
row_sel_store_row_id_to_prebuilt(): Add UNIV_UNLIKELY hint to an
assertion-like test.
a few places accordingly.
os_aio_simulated_handle(): Add TODO comments about skipping the write
checks for compressed pages.
dict_create_sys_tables_tuple(): Write the compressed page size to
the TYPE column.
open_or_create_data_files(): Simplify the fil_node_create() call.
fil_node_create(): Do not touch space->zip_size. It was already initialized
by fil_space_create().
fil_reset_too_high_lsns(), buf_flush_buffered_writes(): Add TODO comment
about compressed pages.
buf_flush_init_for_writing(): Handle pages of type FIL_PAGE_INODE,
FIL_PAGE_IBUF_BITMAP, and FIL_PAGE_TYPE_FSP_HDR as uncompressed ones.
btr_root_raise_and_insert(): When copying root to new_page byte for byte,
restore the page number of new_page afterwards.
buf_flush_init_for_writing(): For FIL_PAGE_INDEX, write the page number
and space id also to the uncompressed page.
Introduce FIL_PAGE_ZBLOB_DATA as a synonym for FIL_PAGE_FILE_FLUSH_LSN.
btr_store_big_rec_extern_fields(): Make the assertion about
dict_table_zip_size() more accurate.
buf_LRU_get_free_block(), buf_block_alloc(): Add parameter zip_size.
buf_calc_zblob_page_checksum(): Remove. Replace with page_zip_calc_checksum().
buf_page_init(): Remove parameter zip_size.
buf_page_io_complete(): Add a placeholder for handling compressed pages.
trx_doublewrite_page_inside(): Remove redundant function.
page_zip_write_rec(): Relax an overly tight assertion about blob_no.
buf_page_print(): Print also compressed pages. Add parameter zip_size.
buf_flush_init_for_writing(): Stamp the fields on a compressed B-tree index
page.
Add the header field FIL_PAGE_ZBLOB_SPACE_ID as an alias of FIL_PAGE_PREV.
page_zip_calc_checksum(): New function.
page_zip_compress(): Avoid copying the fields that are written in
buf_flush_init_for_writing().
page_zip_header_cmp(): New function for comparing those fields of the
page header that will not be written in buf_flush_init_for_writing().
buf_flush_init_for_writing(): Calculate the checksum with the actual zip_size.
buf_calc_zblob_page_checksum(): Skip the field FIL_PAGE_SPACE_OR_CHKSUM.
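A checksum that skips its own storage field can be sketched like this. The sketch assumes that the checksum field occupies the first 4 bytes of the page and uses a hand-written Adler-32 as the checksum function; the log does not name the algorithm, so both are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed: the checksum is stored in the first 4 bytes of the page. */
#define SK_CHECKSUM_FIELD_SIZE	4

/* Plain Adler-32 over a byte range. */
static uint32_t
sk_adler32(const unsigned char* buf, size_t len)
{
	uint32_t	a = 1;
	uint32_t	b = 0;
	size_t		i;

	for (i = 0; i < len; i++) {
		a = (a + buf[i]) % 65521;
		b = (b + a) % 65521;
	}

	return((b << 16) | a);
}

/* Checksum of a page, excluding the field where the checksum itself
will be stored. Otherwise storing the result would invalidate it. */
static uint32_t
sk_page_checksum(const unsigned char* page, size_t size)
{
	assert(page != NULL);
	assert(size > SK_CHECKSUM_FIELD_SIZE);

	return(sk_adler32(page + SK_CHECKSUM_FIELD_SIZE,
			  size - SK_CHECKSUM_FIELD_SIZE));
}
```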
trx_sys_doublewrite_init_or_restore_page(): Use the actual zip_size.
page_cur_insert_rec_low(): If page_zip_alloc() fails, try compressing the
whole page afterwards.
btr_page_reorganize_low(): Rename new_page to temp_page.
btr_store_big_rec_extern_fields(): FIL_PAGE_TYPE is 2 bytes, not 4.
buf_page_init(), buf_page_create(), buf_read_page_low(),
buf_page_init_for_read(): Add parameter zip_size.
buf_page_init_for_backup_restore(),
recv_apply_log_recs_for_backup(): Enclose in #ifdef UNIV_HOTBACKUP.
Enclose some debug code in #ifdef UNIV_LOG_REPLICATE.
page_zip_write_header_log(): Replace page_zip with a pointer to
the uncompressed page.
page_zip_write_rec(): Relax an assertion about blob_no + n_ext.
page_copy_rec_list_to_created_page_write_log(): Allow logging to be disabled.
and to the file space header (FSP_PAGE_ZIP_SIZE, renamed from
FSP_LOWEST_NO_WRITE).
fil_space_struct: Add zip_size.
dict_table_struct: Embed zip_size in flags.
dict_table_zip_size(): Infer zip_size from table->flags.
dict_sys_tables_get_zip_size(): Read zip_size from SYS_TABLES.TYPE.
fil_space_get_zip_size(): Read zip_size from the file space header.
Add the redo log entry type MLOG_ZIP_FILE_CREATE.
dict_mem_table_create(): Account for DICT_TF_COMPRESSED in a debug assertion.
btr_store_big_rec_extern_fields(), btr_free_externally_stored_field(),
btr_copy_externally_stored_field(): Implement the disk format for
compressed BLOB pages.
btr_copy_externally_stored_field(): Improve error reporting and handling
when decompressing BLOB pages.
buf_flush_init_for_writing(), buf_page_is_corrupted(), buf_page_print():
Account for compressed BLOB pages (FIL_PAGE_TYPE_ZBLOB).
buf_calc_zblob_page_checksum(): New function.
btr_create(): page_zip_compress() returns FALSE on failure.
page_zip_write_header(): Write to page_zip->data[] instead of page_zip[].
buf_flush_init_for_writing(): Add parameter page_zip and set the fields
also in the header of the compressed page.
btr_cur_search_to_nth_level(): Add ut_ad() on page_zip_validate().
that will require complete index information.
dict_create_index_step(): Invoke dict_index_add_to_cache() before btr_create().
dict_index_remove_from_cache(): Make public.
dict_index_get_if_in_cache_low(): New function, for holding dict_sys->mutex.
buf_flush_init_for_writing(): Remove the temporary hook to page_zip_compress().
page_create(): Add temporary hook to page_zip_compress().
buf_flush_init_for_writing(): The reported dense page directory size was
4 bytes too large. Subtract 2 (infimum and supremum) from n_heap.
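The arithmetic behind this fix, as a small sketch. It assumes 2 bytes per dense directory slot, which is what a 4-byte error for two extra records implies; the names are hypothetical.

```c
#include <assert.h>

/* Assumed: each dense page directory slot takes 2 bytes. */
#define SK_DIR_SLOT_SIZE	2

/* The infimum and supremum records are counted in n_heap but have
no slot in the dense directory. */
#define SK_N_SYSTEM_RECS	2

/* Size in bytes of the dense page directory for a page whose heap
holds n_heap records, including infimum and supremum. */
static unsigned long
sk_dense_dir_size(unsigned long n_heap)
{
	assert(n_heap >= SK_N_SYSTEM_RECS);

	return(SK_DIR_SLOT_SIZE * (n_heap - SK_N_SYSTEM_RECS));
}
```

For n_heap = 10 this gives 16 bytes, where the uncorrected 2 * n_heap would give 20.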
page_zip_decompress(): When decompressing the last user record, only set
heap_no and the status bits if there is data to decode, i.e., there
are user records on the page.
buf_flush_init_for_writing(): Improve the diagnostics and make the
condition for skipping pages accurate.
univ.i: Introduce UNIV_ZIP_DEBUG for enabling some page_zip_validate() tests.
page0zip.h, page0zip.c: Define and use page_zip_validate() in
page_zip_compress() and page_zip_write() if UNIV_ZIP_DEBUG or UNIV_DEBUG
is defined.
Before the speedc test was interrupted, 121,765 B-tree pages were written.
buf_flush_init_for_writing(): Do not compress pages other than B-tree pages
outside the system tablespace. Report non-B-tree pages.
page_zip_decompress(): Clear the unused heap space on the uncompressed page,
so that the whole buffer for the uncompressed page will be initialized and
page_zip_validate() will always succeed.
A page with multiple records or deleted records still does not compress
or decompress properly.
buf_flush_init_for_writing(): Initialize block->page_zip properly so that all
assertions in page0zip can be enabled.
page_zip_decompress(): Note that corrupt data should not lead to assertions.
page_zip_dir_set(): Correct the interface. Fix off-by-one error.
page_zip_dir_get(): Fix off-by-one error.
page0zip.c: Replace n_heap with n_dense and add comments about
the infimum and supremum records whenever we subtract 2 from heap_no.
Fix some programming errors.
buf0flu.c: Allocate the temporary buffer from buf_frame_alloc().
page_zip_simple_validate(): Do not assert page_zip->m_start >= PAGE_DATA.
page_zip_compress(): Replace some assertions with page_zip_simple_validate(),
and do not assert anything about page_zip->data contents.
page_zip_validate(): Do not compare the page trailer bytes.
page_zip_write(): Assert that the entire page headers match and
that page_zip->m_start >= PAGE_DATA.