there will always be enough space for two node pointer records in an
empty B-tree page. This was reported as Mantis issue #73.
page_zip_rec_needs_ext(): Add the parameter n_fields, for accurate
estimation of the compressed size of the data dictionary information.
Given that this function is only invoked for records on leaf pages,
require that there be enough space for one record in the compressed
page. We check elsewhere that there will be enough room for two node
pointer records on higher-level pages.
btr_cur_optimistic_insert(): Ensure that there will be enough room for
two node pointer records on an empty non-leaf page. The rule for
leaf-page records will be enforced by the callers of
page_zip_rec_needs_ext().
btr_cur_pessimistic_insert(): Remove the insufficient check that the
leaf page record should be compressible by itself. Instead, now we
require that two node pointer records fit on a non-leaf page, and one
record will fit in uncompressed form on the leaf page.
page_zip_write_header(), page_zip_write_rec(): Re-enable the debug
assertions that were violated by the insufficient check in
btr_cur_pessimistic_insert().
innodb_bug36172.test: Use a larger compressed page size.
buf_block_align() on a non-file page frame that was created in
btr_cur_pessimistic_insert(), to see if a record fits on a compressed
page by itself. These assertions caused an assertion failure in
buf_block_align() in innodb_bug36172.test.
page_zip_write_rec(), page_zip_write_header(): Remove the assertion
that calls buf_frame_get_page_zip().
INNODB_ZIP and INNODB_ZIP_RESET to
INNODB_COMPRESSION and INNODB_COMPRESSION_RESET,
and remove the statistics of the buddy system.
This change was discussed with Ken. It makes the tables shorter
and easier to understand. The removed data will be represented in
the tables INNODB_COMPRESSION_BUDDY and INNODB_COMPRESSION_BUDDY_RESET
that will be added later.
i_s_innodb_zip, i_s_innodb_zip_reset, i_s_zip_fields_info[],
i_s_zip_fill_low(), i_s_zip_fill(), i_s_zip_reset_fill(),
i_s_zip_init(), i_s_zip_reset_init(): Replace "zip" with "compression".
i_s_compression_fields_info[]: Remove "used", "free",
"relocated", "relocated_usec". In "compressed_usec" and "decompressed_usec",
replace microseconds with seconds ("usec" with "sec").
page_zip_decompress(): Correct a typo in the function comment.
PAGE_ZIP_SSIZE_BITS, PAGE_ZIP_NUM_SSIZE: New constants.
page_zip_stat_t, page_zip_stat: Statistics of the compression, grouped
by page size.
page_zip_simple_validate(): Assert that page_zip->ssize is reasonable.
lock_get_table(), locks_row_eq_lock(), buf_page_get_mutex(): Add return
after ut_error. On Windows, ut_error is not declared as "noreturn".
Add explicit type casts when assigning ulint to byte to get rid of
"possible loss of precision" warnings.
struct i_s_table_cache_struct: Declare rows_used, rows_allocd as ulint
instead of ullint. 32 bits should be enough.
fill_innodb_trx_from_cache(), i_s_zip_fill_low(): Cast 64-bit unsigned
integers to longlong when calling Field::store(longlong, bool is_unsigned).
Otherwise, the compiler would implicitly convert them to double and
invoke Field::store(double) instead.
recv_truncate_group(), recv_copy_group(), recv_calc_lsn_on_data_add():
Cast ib_uint64_t expressions to ulint to get rid of "possible loss of
precision" warnings. (There should not be any loss of precision in
these cases.)
log_close(), log_checkpoint_margin(): Declare some variables as ib_uint64_t
instead of ulint, so that there won't be any potential loss of precision.
mach_write_ull(): Cast the second argument of mach_write_to_4() to ulint.
OS_FILE_FROM_FD(): Cast the return value of _get_osfhandle() to HANDLE.
row_merge_dict_table_get_index(): Cast the parameter of mem_free() to (void*)
in order to get rid of the bogus MSVC warning C4090, which has been reported
as MSVC bug 101661:
<http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=101661>
row_mysql_read_blob_ref(): To get rid of a bogus MSVC warning C4090,
drop a const qualifier.
blocks that contains uncompressed and compressed frames. This patch was
designed by Heikki and Inaam, implemented by Inaam, and refined and reviewed
by Marko and Sunny.
buf_buddy_n_frames, buf_buddy_min_n_frames, buf_buddy_max_n_frames: Remove.
buf_page_belongs_to_unzip_LRU(): New predicate:
bpage->zip.data && buf_page_get_state(bpage) == BUF_BLOCK_FILE_PAGE.
buf_pool_t, buf_block_t: Add the linked list unzip_LRU. A block in the
regular LRU list is in unzip_LRU iff buf_page_belongs_to_unzip_LRU() holds.
buf_LRU_free_block(): Add a third return value to refine the case
"cannot free the block".
buf_LRU_search_and_free_block(): Update the documentation to reflect the
implementation.
buf_LRU_stat_t, buf_LRU_stat_cur, buf_LRU_stat_sum, buf_LRU_stat_arr[]:
Statistics for the unzip_LRU algorithm.
buf_LRU_stat_update(): New function: Update the statistics. Called once
per second by srv_error_monitor_thread().
buf_LRU_validate(): Validate the unzip_LRU list as well.
buf_LRU_evict_from_unzip_LRU(): New predicate: Use the unzip_LRU before
falling back to the regular LRU?
buf_LRU_free_from_unzip_LRU_list(), buf_LRU_free_from_common_LRU_list():
Subfunctions of buf_LRU_search_and_free_block().
buf_LRU_search_and_free_block(): Reimplement. Try to evict an uncompressed
page from the unzip_LRU list before falling back to evicting an entire block
from the common LRU list.
buf_unzip_LRU_remove_block_if_needed(): New function.
buf_unzip_LRU_add_block(): New function: Add a block to the unzip_LRU list.
buf_buddy_relocated_duration[],
page_zip_compress_duration[]
page_zip_decompress_duration[]: Record the total duration of the operations.
buf_buddy_relocate(), page_zip_compress(), page_zip_decompress():
Add ut_time_us() instrumentation.
i_s_zip_fields_info[], i_s_zip_fill_low(): Move the columns containing
cumulated statistics last. Add relocated_usec, compressed_usec, and
decompressed_usec.
for the purpose of comparing different compression algorithms.
PAGE_ZIP_COMPRESS_DBG: New preprocessor condition, to see if deflate()
is wrapped.
page_zip_compress_log: Log file counter. If set to nonzero, logging
is enabled.
page_zip_compress_deflate(): Add the parameter logfile.
FILE_LOGFILE, LOGFILE: Macros for declaring and passing the parameter logfile.
page_zip_compress(): Open and close the logfile if needed. Write the
uncompressed page and the size of the compressed data. The data passed
to deflate() is written by the wrapper page_zip_compress_deflate().
symbols. Use it for all definitions of non-static variables and functions.
lexyy.c, make_flex.sh: Declare yylex as UNIV_INTERN, not static. It is
referenced from pars0grm.c.
Actually, according to
nm .libs/ha_innodb.so|grep -w '[ABCE-TVXYZ]'
the following symbols are still global:
* The vtable for class ha_innodb
* pars0grm.c: The function yyparse() and the variables yychar, yylval, yynerrs
The required changes to the Bison-generated file pars0grm.c will be addressed
in a separate commit, which will add a script similar to make_flex.sh.
The class ha_innodb is renamed from class ha_innobase by a #define. Thus,
there will be no clash with the builtin InnoDB. However, there will be some
overhead for invoking virtual methods of class ha_innodb. Ideas for making
the vtable hidden are welcome. -fvisibility=hidden is not available in GCC 3.
just one parameter: UNIV_PAGE_SIZE_SHIFT.
UNIV_PAGE_SIZE, BUF_BUDDY_SIZES: Define in terms of UNIV_PAGE_SIZE_SHIFT.
BUF_BUDDY_LOW_SHIFT: New macro, to simplify the definition of BUF_BUDDY_LOW
and BUF_BUDDY_SIZES.
PAGE_ZIP_DIR_SLOT_MASK: Relax the compile-time check. This bitmask must be
one less than a power of two, and at least UNIV_PAGE_SIZE - 1.
is an overlap between BLOB pointers and the modification log or the
zlib stream.
page_zip_decompress_clust_ext(): Remove the improper check. The
d_stream->avail_in cannot be decremented here, because we do not know
at this point if the record is deleted. No space is reserved for the
BLOB pointers in deleted records.
page_zip_decompress_clust(): Check for the overlap here, right before
copying the BLOB pointers.
page_zip_decompress_clust(): Also check that the target column is long
enough, and return FALSE instead of ut_ad() failure.
some decompression functions.
page_zip_apply_log_ext(), page_zip_apply_log(): Call page_zip_fail()
with appropriate diagnostics before returning NULL.
page_zip_decompress_node_ptrs(), page_zip_decompress_sec(),
page_zip_decompress_clust(): When detecting that the zlib stream
followed by the modification log overlaps the trailer, do not
let an assertion fail, but invoke page_zip_fail() and return FALSE.
Corrupt data should never lead into assertion failures in decompression
functions.
buf_page_get_release_on_io(): Removed this unused function.
ibuf_build_entry_from_ibuf_rec(): Justify why it is not necessary to
add system columns to the dummy table pointed to by the dummy secondary index.
page_zip_rec_set_deleted(): Add a page_zip_validate() assertion.
the wrapper macro page_zip_fail() for displaying error messages.
When the error output is enabled (at compile-time), a breakpoint
may be set in page_zip_fail_func to easily debug all decompression
errors in the context where they occur.
in page_zip_decompress().
page_zip_decompress_clust(), page_zip_decompress_clust_ext(): Zero-fill
the columns DB_TRX_ID and DB_ROLL_PTR on the uncompressed page.
page_zip_get_trailer_len(), page_zip_write_header(): Correct the
UNIV_MEM_ASSERT_RW() assertions.
page_zip_validate(): Read the validity bits of page, page_zip, and
page_zip->data.
page_zip_decompress(): Assert that the uncompressed page is completely defined.
page_zip_validate(): Assert that the compressed and uncompressed pages are
completely defined. Fetch the "valid" bits, so that they can be examined
when run under valgrind --db-attach=yes.
initialized, although Valgrind believes that some bits in the 7th or 8th
bytes from the end are uninitialized. (They might be, but the decompressor
should not care about those bits after encountering the end-of-stream marker
in the compressed bit stream.)
btr_cur_optimistic_insert(): On compressed tablespaces, check that both
the compressed and the uncompressed page are completely initialized in
the beginning of the function.
page_zip_compress(): After successful compression, check that the compressed
page is completely initialized.
page_zip_write_rec(), page_zip_write_blob_ptr(), page_zip_write_node_ptr(),
page_zip_write_trx_id_and_roll_ptr(), page_zip_clear_rec(),
page_zip_rec_set_deleted(), page_zip_rec_set_owned(), page_zip_dir_insert(),
page_zip_dir_delete(), page_zip_dir_add_slot(), page_zip_reorganize(),
page_zip_copy(), page_zip_get_trailer_len(), page_zip_write_header():
Assert that the complete contents of the compressed page is defined.
page_zip_compress(): Assert that the contents of the uncompressed page
is entirely initialized.
page_zip_decompress(): Assert that the contents of the compressed page
is entirely initialized. Assert that the uncompressed page is entirely
writeable. Flag the uncompressed page uninitialized in the beginning.
For some reason, GCC 4.2.1 ignores casts (for removing constness)
in calls to inline functions.
page_align(), ut_align_down(): Make the parameter const void*, but still
return a non-const pointer. This is ugly, but these functions cannot be
replaced with a const-preserving macro in a portable way, given that
the pointer argument is not always pointing to bytes.
buf_block_get_page_zip(): Implement as a const-preserving macro.
buf_frame_get_page_zip(), buf_block_align(): Add const qualifiers.
lock_rec_get_prev(): Silence GCC 4.2.1 warnings.
mlog_write_initial_log_record(), mlog_write_initial_log_record_fast(),
mtr_memo_contains(): Add const qualifier to the pointer.
page_header_get_ptr(): Rewrite as page_header_get_offs(), and
implement as a macro that calls this function.
mem_heap_zalloc() and mem_zalloc(), because calloc() in the C runtime
library takes two size parameters, not one.
mem_heap_zalloc(): Add debug assertions. Document that the return value
is never NULL.
of UT_SORT_FUNCTION_BODY is best done by defining SORT_FUN and CMP_FUN as
macros when needed. The solution of r1523 allows for only one extra parameter.
the insert buffer bitmap.
ibuf_set_free_bits_func(): Never disable redo logging.
ibuf_update_free_bits_zip(): Remove.
btr_page_reorganize_low(), page_zip_reorganize(): Do not update the insert
buffer bitmap. Instead, document that callers will have to take care of it,
and adapt the callers.
btr_compress(): On error, reset the insert buffer free bits.
btr_cur_insert_if_possible(): Do not modify the insert buffer bitmap.
btr_compress(), btr_cur_optimistic_insert(): On compressed pages,
reset the insert buffer bitmap. Document why.
btr_cur_update_alloc_zip(): Document why it is necessary and sufficient
to reset the insert buffer free bits.
btr_cur_update_in_place(), btr_cur_optimistic_update(),
btr_cur_pessimistic_update(): Update the free bits in the same
mini-transaction. Document that the mini-transaction must be
committed before latching any further pages. Verify that this
is the case in all execution paths.
row_ins_sec_index_entry_by_modify(), row_ins_clust_index_entry_by_modify(),
row_undo_mod_clust_low(): Because these functions call
btr_cur_update_in_place(), btr_cur_optimistic_update(), or
btr_cur_pessimistic_update(), document that the mini-transaction must be
committed before latching any further pages. Verify that this is the case
in all execution paths.
crash recovery is in progress. This avoids a hang when
btr_parse_page_reorganize(), called from an I/O handler thread,
attempts to acquire log_sys->mutex while it is being held by
the main thread (the one that runs innobase_init()). This change
was committed accidentally. It may be unsafe to clear
mtr.modifications, because buf_page_release() at mtr_commit() may
forget to put modified pages to the flush list.
Cleanup: Remove the "type" parameter from many ibuf functions.
Let the caller check that !dict_index_is_clust(). This should avoid
function calls and register spilling.
ibuf_set_free_bits_func(), ibuf_set_free_bits(): Remove the parameter "type".
ibuf_reset_free_bits_with_type(): Rename to ibuf_reset_free_bits().
Remove the parameter "type".
ibuf_update_free_bits_if_full(), ibuf_update_free_bits_zip(),
ibuf_update_free_bits_low(), ibuf_update_free_bits_for_two_pages_low():
Remove the parameter "index".
ha_innodb.cc: Add the columns COMPRESSED, COMPRESSED_OK, DECOMPRESSED
to INFORMATION_SCHEMA.INNODB_BUDDY.
page_zip_compress_count[], page_zip_compress_ok[]: New statistic counters,
incremented in page_zip_compress().
page_zip_decompress_count[]: New statistic counter,
incremented in page_zip_decompress().
failing insert. Reorganization will have been attempted in
page_cur_tuple_insert() or page_cur_rec_insert().
page_zip_reorganize(): Recompute the insert buffer free bits for
leaf pages of secondary indexes.
ibuf_data_enough_free_for_insert(): Simplify.
record does not contain externally stored columns.
page_zip_decompress_clust_ext(): New function for decompressing records
containing externally stored columns.
externally stored columns.
page_zip_compress_clust_ext(), page_zip_apply_log_ext():
New functions.
page_zip_compress_clust(), page_zip_apply_log(): Check rec_offs_any_extern()
and avoid invoking the costly loop in most cases.
Similar optimizations can be made in page_zip_decompress_clust() and
page_zip_write_rec().
by adding the REC_OFFS_EXTERNAL flag to rec_offs_base(offsets)[0].
This reduces the processor usage of page_zip_write_rec() by about 40%
in one test case. The code could be sped up further by testing
rec_offs_any_extern() outside of loops that check rec_offs_nth_extern().
The vast majority of records does not contain any externally stored columns.
page_cur_insert_rec_zip_reorg(): New function: Recompress or
reorganize a compressed page.
page_cur_insert_rec_zip(): New function: insert a record to
a compressed page.
page_cur_insert_rec_low(): Only handle inserts to uncompressed pages.
to tables.
dict_mem_table_add_col(): Add the parameter "heap" for temporary memory
allocation. Allow it and "name" to be NULL. These parameters are NULL
when creating dummy indexes.
dict_add_col_name(): Remove calls to ut_malloc() and ut_free().
dict_table_get_col_name(): Allow table->col_names to be NULL.
dict_table_add_system_columns(), dict_table_add_to_cache():
Add the parameter "heap".