page_zip_hexdump_func(): New function, to dump a block of data.
ut_print_buf() would dump everything on a single line, which is hard
to read.
page_zip_hexdump(): Wrapper macro for page_zip_hexdump_func().
page_zip_validate(): Dump page_zip, page_zip->data, page, and temp_page if !valid.
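A minimal sketch of the multi-line dump idea, not the actual InnoDB code.
The names sk_hexdump_func() and sk_hexdump(), the 16-bytes-per-line layout
and the FILE* parameter are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sketch: print `size` bytes of `buf` as hex, 16 bytes per
   output line, each line prefixed with `name` and the offset of its first
   byte.  Returns the number of lines written. */
static int sk_hexdump_func(FILE *out, const char *name,
                           const void *buf, size_t size)
{
    const unsigned char *s = buf;
    size_t addr;
    int lines = 0;

    assert(out != NULL);

    for (addr = 0; addr < size; addr += 16) {
        size_t n = size - addr < 16 ? size - addr : 16;
        size_t i;

        fprintf(out, "%s 0x%04lx:", name, (unsigned long) addr);
        for (i = 0; i < n; i++) {
            fprintf(out, " %02x", s[addr + i]);
        }
        fputc('\n', out);
        lines++;
    }
    return lines;
}

/* Wrapper macro: label the dump with the expression text of `buf`. */
#define sk_hexdump(out, buf, size) sk_hexdump_func(out, #buf, buf, size)
```

Calling sk_hexdump(stderr, page, 40) on a 40-byte buffer would emit three
lines, each labelled "page", which is easier to scan than one long line.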
fields that are related to the records stored in the page.
page_zip_copy() is a fall-back method in certain B-tree operations
(tree compression, splitting or merging nodes). The contents of a
page may fit in the compressed page frame when it has been modified in
a certain sequence, but not when the page is recompressed. Sometimes,
copying all or part of the records to an empty page could fail because
of compression overflow. In such cases, we copy the compressed and
uncompressed pages bit for bit and delete any unwanted records from
the copy. (Deletion is guaranteed to succeed.) The method
page_zip_copy() is invoked very rarely.
In one case, page_zip_copy() was called in btr_lift_page_up() to move
the records to the root page of the B-tree. Because page_zip_copy()
copied all B-tree page header fields, it overwrote the file segment
header fields PAGE_BTR_SEG_LEAF and PAGE_BTR_SEG_TOP. This is the
probable cause of the corruption that was reported as Mantis issue #63
and others.
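The text above does not show how the overwrite is avoided. One plausible
approach, sketched here, is to save and restore the segment header fields
around the bitwise copy. The page size, the offset and the length below
are hypothetical toy values, not the real InnoDB page layout.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout, chosen only for this sketch. */
enum {
    SK_PAGE_SIZE = 64,      /* toy page size */
    SK_SEG_HDR_OFFSET = 8,  /* where the two segment headers start */
    SK_SEG_HDR_LEN = 20     /* combined length of both headers */
};

/* Copy src to dst bit for bit, except that the segment header fields of
   dst keep the values they had before the copy. */
static void sk_page_copy_keep_seg_hdrs(unsigned char *dst,
                                       const unsigned char *src)
{
    unsigned char saved[SK_SEG_HDR_LEN];

    assert(dst != src);

    memcpy(saved, dst + SK_SEG_HDR_OFFSET, SK_SEG_HDR_LEN);
    memcpy(dst, src, SK_PAGE_SIZE);
    memcpy(dst + SK_SEG_HDR_OFFSET, saved, SK_SEG_HDR_LEN);
}
```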
there will always be enough space for two node pointer records in an
empty B-tree page. This was reported as Mantis issue #73.
page_zip_rec_needs_ext(): Add the parameter n_fields, for accurate
estimation of the compressed size of the data dictionary information.
Given that this function is only invoked for records on leaf pages,
require that there be enough space for one record in the compressed
page. We check elsewhere that there will be enough room for two node
pointer records on higher-level pages.
btr_cur_optimistic_insert(): Ensure that there will be enough room for
two node pointer records on an empty non-leaf page. The rule for
leaf-page records will be enforced by the callers of
page_zip_rec_needs_ext().
btr_cur_pessimistic_insert(): Remove the insufficient check that the
leaf page record should be compressible by itself. Instead, we now
require that two node pointer records fit on a non-leaf page, and that
one record fit in uncompressed form on the leaf page.
page_zip_write_header(), page_zip_write_rec(): Re-enable the debug
assertions that were violated by the insufficient check in
btr_cur_pessimistic_insert().
innodb_bug36172.test: Use a larger compressed page size.
buf_block_align() on a non-file page frame that was created in
btr_cur_pessimistic_insert(), to see if a record fits on a compressed
page by itself. These assertions caused an assertion failure in
buf_block_align() in innodb_bug36172.test.
page_zip_write_rec(), page_zip_write_header(): Remove the assertion
that calls buf_frame_get_page_zip().
INNODB_ZIP and INNODB_ZIP_RESET to
INNODB_COMPRESSION and INNODB_COMPRESSION_RESET,
and remove the statistics of the buddy system.
This change was discussed with Ken. It makes the tables shorter
and easier to understand. The removed data will be represented in
the tables INNODB_COMPRESSION_BUDDY and INNODB_COMPRESSION_BUDDY_RESET
that will be added later.
i_s_innodb_zip, i_s_innodb_zip_reset, i_s_zip_fields_info[],
i_s_zip_fill_low(), i_s_zip_fill(), i_s_zip_reset_fill(),
i_s_zip_init(), i_s_zip_reset_init(): Replace "zip" with "compression".
i_s_compression_fields_info[]: Remove "used", "free",
"relocated", "relocated_usec". In "compressed_usec" and "decompressed_usec",
replace microseconds with seconds ("usec" with "sec").
page_zip_decompress(): Correct a typo in the function comment.
PAGE_ZIP_SSIZE_BITS, PAGE_ZIP_NUM_SSIZE: New constants.
page_zip_stat_t, page_zip_stat: Statistics of the compression, grouped
by page size.
page_zip_simple_validate(): Assert that page_zip->ssize is reasonable.
lock_get_table(), locks_row_eq_lock(), buf_page_get_mutex(): Add return
after ut_error. On Windows, ut_error is not declared as "noreturn".
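A sketch of the pattern, assuming a hypothetical SK_ERROR() macro in place
of ut_error and a made-up enum. When the compiler cannot tell that the
error macro never returns, a non-void function needs an explicit return
after it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for ut_error.  Assume that the compiler does not
   know that this never returns. */
#define SK_ERROR() \
    do { fputs("fatal error\n", stderr); abort(); } while (0)

enum sk_lock_type { SK_LOCK_TABLE, SK_LOCK_REC };

static const char *sk_lock_type_name(enum sk_lock_type type)
{
    switch (type) {
    case SK_LOCK_TABLE:
        return "TABLE";
    case SK_LOCK_REC:
        return "RECORD";
    }

    SK_ERROR();
    return NULL; /* not reached; avoids a "missing return value" warning */
}
```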
Add explicit type casts when assigning ulint to byte to get rid of
"possible loss of precision" warnings.
struct i_s_table_cache_struct: Declare rows_used, rows_allocd as ulint
instead of ullint. 32 bits should be enough.
fill_innodb_trx_from_cache(), i_s_zip_fill_low(): Cast 64-bit unsigned
integers to longlong when calling Field::store(longlong, bool is_unsigned).
Otherwise, the compiler would implicitly convert them to double and
invoke Field::store(double) instead.
recv_truncate_group(), recv_copy_group(), recv_calc_lsn_on_data_add():
Cast ib_uint64_t expressions to ulint to get rid of "possible loss of
precision" warnings. (There should not be any loss of precision in
these cases.)
log_close(), log_checkpoint_margin(): Declare some variables as ib_uint64_t
instead of ulint, so that there won't be any potential loss of precision.
mach_write_ull(): Cast the second argument of mach_write_to_4() to ulint.
OS_FILE_FROM_FD(): Cast the return value of _get_osfhandle() to HANDLE.
row_merge_dict_table_get_index(): Cast the parameter of mem_free() to (void*)
in order to get rid of the bogus MSVC warning C4090, which has been reported
as MSVC bug 101661:
<http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=101661>
row_mysql_read_blob_ref(): To get rid of a bogus MSVC warning C4090,
drop a const qualifier.
blocks that contains uncompressed and compressed frames. This patch was
designed by Heikki and Inaam, implemented by Inaam, and refined and reviewed
by Marko and Sunny.
buf_buddy_n_frames, buf_buddy_min_n_frames, buf_buddy_max_n_frames: Remove.
buf_page_belongs_to_unzip_LRU(): New predicate:
bpage->zip.data && buf_page_get_state(bpage) == BUF_BLOCK_FILE_PAGE.
buf_pool_t, buf_block_t: Add the linked list unzip_LRU. A block in the
regular LRU list is in unzip_LRU iff buf_page_belongs_to_unzip_LRU() holds.
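The predicate can be sketched as follows. The struct and enum are
simplified stand-ins (hypothetical, prefixed sk_) for buf_page_t and the
block states; only the condition itself is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

enum sk_page_state {
    SK_BLOCK_ZIP_PAGE,   /* compressed frame only */
    SK_BLOCK_FILE_PAGE,  /* uncompressed frame present */
    SK_BLOCK_NOT_USED
};

struct sk_page {
    enum sk_page_state state;
    struct {
        unsigned char *data;  /* compressed frame, or NULL */
    } zip;
};

/* A block belongs to unzip_LRU iff it has both a compressed frame and an
   uncompressed file page. */
static int sk_page_belongs_to_unzip_LRU(const struct sk_page *bpage)
{
    assert(bpage != NULL);

    return bpage->zip.data != NULL
        && bpage->state == SK_BLOCK_FILE_PAGE;
}
```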
buf_LRU_free_block(): Add a third return value to refine the case
"cannot free the block".
buf_LRU_search_and_free_block(): Update the documentation to reflect the
implementation.
buf_LRU_stat_t, buf_LRU_stat_cur, buf_LRU_stat_sum, buf_LRU_stat_arr[]:
Statistics for the unzip_LRU algorithm.
buf_LRU_stat_update(): New function: Update the statistics. Called once
per second by srv_error_monitor_thread().
buf_LRU_validate(): Validate the unzip_LRU list as well.
buf_LRU_evict_from_unzip_LRU(): New predicate: Use the unzip_LRU before
falling back to the regular LRU?
buf_LRU_free_from_unzip_LRU_list(), buf_LRU_free_from_common_LRU_list():
Subfunctions of buf_LRU_search_and_free_block().
buf_LRU_search_and_free_block(): Reimplement. Try to evict an uncompressed
page from the unzip_LRU list before falling back to evicting an entire block
from the common LRU list.
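The two-stage eviction can be sketched with list lengths standing in for
the real lists. Everything here is a simplified, hypothetical model: the
actual functions scan the lists and take further arguments.

```c
#include <assert.h>

struct sk_pool {
    int unzip_LRU_len;  /* blocks with both frames */
    int LRU_len;        /* all blocks; always >= unzip_LRU_len */
};

/* Drop one uncompressed frame.  The block keeps its compressed frame and
   therefore stays in the common LRU list. */
static int sk_free_from_unzip_LRU(struct sk_pool *pool)
{
    if (pool->unzip_LRU_len == 0) {
        return 0;
    }
    pool->unzip_LRU_len--;
    return 1;
}

/* Evict one entire block from the common LRU list. */
static int sk_free_from_common_LRU(struct sk_pool *pool)
{
    if (pool->LRU_len == 0) {
        return 0;
    }
    if (pool->unzip_LRU_len == pool->LRU_len) {
        pool->unzip_LRU_len--;  /* the victim was in both lists */
    }
    pool->LRU_len--;
    return 1;
}

/* Try the unzip_LRU list first if the heuristic says so, then fall back
   to the common LRU list. */
static int sk_search_and_free(struct sk_pool *pool, int prefer_unzip)
{
    int freed = 0;

    assert(pool->unzip_LRU_len <= pool->LRU_len);

    if (prefer_unzip) {
        freed = sk_free_from_unzip_LRU(pool);
    }
    if (!freed) {
        freed = sk_free_from_common_LRU(pool);
    }
    return freed;
}
```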
buf_unzip_LRU_remove_block_if_needed(): New function.
buf_unzip_LRU_add_block(): New function: Add a block to the unzip_LRU list.
buf_buddy_relocated_duration[],
page_zip_compress_duration[],
page_zip_decompress_duration[]: Record the total duration of the operations.
buf_buddy_relocate(), page_zip_compress(), page_zip_decompress():
Add ut_time_us() instrumentation.
i_s_zip_fields_info[], i_s_zip_fill_low(): Move the columns containing
cumulative statistics last. Add relocated_usec, compressed_usec, and
decompressed_usec.
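A sketch of this kind of instrumentation. The clock is passed in as a
function pointer so that the example stays portable and testable; the
names, the number of size classes and the two test doubles at the end are
all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { SK_NUM_SIZES = 5 };  /* hypothetical number of size classes */

/* Total time spent compressing, in microseconds, per size class. */
static uint64_t sk_compress_duration[SK_NUM_SIZES];

/* Stand-in for a microsecond clock. */
typedef uint64_t (*sk_time_us_fn)(void);

/* Run one compression and add its duration to the per-size total. */
static int sk_timed_compress(unsigned size_class, sk_time_us_fn now,
                             int (*compress)(void))
{
    uint64_t start;
    int ok;

    assert(size_class < SK_NUM_SIZES);

    start = now();
    ok = compress();
    sk_compress_duration[size_class] += now() - start;
    return ok;
}

/* Test doubles: a clock that advances 250 us per reading, and a
   compression routine that always succeeds. */
static uint64_t sk_fake_now(void)
{
    static uint64_t t;

    t += 250;
    return t;
}

static int sk_fake_compress(void)
{
    return 1;
}
```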
for the purpose of comparing different compression algorithms.
PAGE_ZIP_COMPRESS_DBG: New preprocessor condition, to see if deflate()
is wrapped.
page_zip_compress_log: Log file counter. If set to nonzero, logging
is enabled.
page_zip_compress_deflate(): Add the parameter logfile.
FILE_LOGFILE, LOGFILE: Macros for declaring and passing the parameter logfile.
page_zip_compress(): Open and close the logfile if needed. Write the
uncompressed page and the size of the compressed data. The data passed
to deflate() is written by the wrapper page_zip_compress_deflate().
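A sketch of how a pair of macros can add an optional parameter to both
the declaration and the call sites. All sk_ and SK_ names are
hypothetical, and the "compression" step is a placeholder that only
reports the input length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SK_COMPRESS_DBG 1  /* hypothetical switch; remove to disable */

#ifdef SK_COMPRESS_DBG
# define SK_FILE_LOGFILE FILE *logfile,  /* declares the parameter */
# define SK_LOGFILE logfile,             /* passes the parameter */
#else
# define SK_FILE_LOGFILE
# define SK_LOGFILE
#endif

/* Placeholder for the deflate() wrapper: logs the input if a log file is
   open, then returns the number of bytes consumed. */
static size_t sk_compress_step(SK_FILE_LOGFILE
                               const unsigned char *in, size_t len)
{
    assert(in != NULL);
#ifdef SK_COMPRESS_DBG
    if (logfile != NULL) {
        fwrite(in, 1, len, logfile);
    }
#endif
    return len;
}

/* The caller forwards the optional parameter with SK_LOGFILE. */
static size_t sk_compress(SK_FILE_LOGFILE
                          const unsigned char *page, size_t len)
{
    return sk_compress_step(SK_LOGFILE page, len);
}
```

With SK_COMPRESS_DBG undefined, both macros expand to nothing and the
functions take only the data arguments, so the non-debug build carries no
extra parameter.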
symbols. Use it for all definitions of non-static variables and functions.
lexyy.c, make_flex.sh: Declare yylex as UNIV_INTERN, not static. It is
referenced from pars0grm.c.
Actually, according to
nm .libs/ha_innodb.so|grep -w '[ABCE-TVXYZ]'
the following symbols are still global:
* The vtable for class ha_innodb
* pars0grm.c: The function yyparse() and the variables yychar, yylval, yynerrs
The required changes to the Bison-generated file pars0grm.c will be addressed
in a separate commit, which will add a script similar to make_flex.sh.
The class ha_innodb is renamed from class ha_innobase by a #define. Thus,
there will be no clash with the builtin InnoDB. However, there will be some
overhead for invoking virtual methods of class ha_innodb. Ideas for making
the vtable hidden are welcome. -fvisibility=hidden is not available in GCC 3.
just one parameter: UNIV_PAGE_SIZE_SHIFT.
UNIV_PAGE_SIZE, BUF_BUDDY_SIZES: Define in terms of UNIV_PAGE_SIZE_SHIFT.
BUF_BUDDY_LOW_SHIFT: New macro, to simplify the definition of BUF_BUDDY_LOW
and BUF_BUDDY_SIZES.
PAGE_ZIP_DIR_SLOT_MASK: Relax the compile-time check. This bitmask must be
one less than a power of two, and at least UNIV_PAGE_SIZE - 1.
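The two conditions on the bitmask can be checked at compile time, as in
this sketch. The SK_ names and the values (shift 14, mask 0x3fff) are
assumptions for illustration; sk_mask_is_valid() is a hypothetical
run-time equivalent of the same rule.

```c
#include <assert.h>

#define SK_PAGE_SIZE_SHIFT 14                  /* the single parameter */
#define SK_PAGE_SIZE (1 << SK_PAGE_SIZE_SHIFT)
#define SK_DIR_SLOT_MASK 0x3fff

/* A value m is one less than a power of two iff m & (m + 1) == 0. */
#if SK_DIR_SLOT_MASK & (SK_DIR_SLOT_MASK + 1)
# error "SK_DIR_SLOT_MASK is not one less than a power of two"
#endif
#if SK_DIR_SLOT_MASK < SK_PAGE_SIZE - 1
# error "SK_DIR_SLOT_MASK cannot represent every offset in the page"
#endif

/* Run-time form of the same two conditions. */
static int sk_mask_is_valid(unsigned long mask, unsigned long page_size)
{
    assert(page_size > 0);

    return (mask & (mask + 1)) == 0 && mask >= page_size - 1;
}
```

The relaxed check accepts a mask that is wider than strictly necessary
(for example 0x7fff with a 16384-byte page), which an exact-equality
check would reject.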
is an overlap between BLOB pointers and the modification log or the
zlib stream.
page_zip_decompress_clust_ext(): Remove the improper check. The
d_stream->avail_in cannot be decremented here, because we do not know
at this point if the record is deleted. No space is reserved for the
BLOB pointers in deleted records.
page_zip_decompress_clust(): Check for the overlap here, right before
copying the BLOB pointers.
page_zip_decompress_clust(): Also check that the target column is long
enough, and return FALSE instead of ut_ad() failure.
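A sketch of both checks, using offsets into a toy page buffer. The
function name, the parameters and the 20-byte pointer size are
assumptions of this sketch. BLOB pointers are modelled as stored at the
end of the page and growing downward, toward the end of the compressed
stream and modification log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SK_BLOB_REF_SIZE = 20 };  /* assumed size of one BLOB pointer */

/* Copy one BLOB pointer into the last SK_BLOB_REF_SIZE bytes of the
   target column.
   page:        compressed page buffer
   stream_end:  offset one past the last byte used by the stream and log
   externs_end: offset one past the BLOB pointer to be copied
   Returns 1 on success, 0 if the page looks corrupt. */
static int sk_copy_blob_ptr(unsigned char *dst, size_t dst_len,
                            const unsigned char *page,
                            size_t stream_end, size_t externs_end)
{
    assert(dst != NULL && page != NULL);

    if (dst_len < SK_BLOB_REF_SIZE) {
        return 0;  /* target column is too short */
    }
    if (externs_end < SK_BLOB_REF_SIZE
        || externs_end - SK_BLOB_REF_SIZE < stream_end) {
        return 0;  /* pointer would overlap the stream or the log */
    }

    memcpy(dst + dst_len - SK_BLOB_REF_SIZE,
           page + externs_end - SK_BLOB_REF_SIZE, SK_BLOB_REF_SIZE);
    return 1;
}
```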
some decompression functions.
page_zip_apply_log_ext(), page_zip_apply_log(): Call page_zip_fail()
with appropriate diagnostics before returning NULL.
page_zip_decompress_node_ptrs(), page_zip_decompress_sec(),
page_zip_decompress_clust(): When detecting that the zlib stream
followed by the modification log overlaps the trailer, do not
let an assertion fail, but invoke page_zip_fail() and return FALSE.
Corrupt data should never lead to assertion failures in decompression
functions.
buf_page_get_release_on_io(): Remove this unused function.
ibuf_build_entry_from_ibuf_rec(): Justify why it is not necessary to
add system columns to the dummy table pointed to by the dummy secondary index.
page_zip_rec_set_deleted(): Add a page_zip_validate() assertion.
the wrapper macro page_zip_fail() for displaying error messages.
When the error output is enabled (at compile-time), a breakpoint
may be set in page_zip_fail_func to easily debug all decompression
errors in the context where they occur.
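A sketch of such a wrapper, with hypothetical sk_ names. The macro takes
its arguments in double parentheses so that a variable argument list can
be forwarded without variadic macros, and it expands to nothing when the
diagnostics are disabled. The failure counter exists only to make the
sketch observable.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define SK_ZIP_FAIL_ENABLED 1  /* hypothetical compile-time switch */

#ifdef SK_ZIP_FAIL_ENABLED
static int sk_zip_fail_count;

/* Single reporting function: a convenient place for a breakpoint. */
static int sk_zip_fail_func(const char *fmt, ...)
{
    va_list ap;
    int res;

    sk_zip_fail_count++;
    fputs("decompress error: ", stderr);
    va_start(ap, fmt);
    res = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return res;
}
# define sk_zip_fail(fmt_args) sk_zip_fail_func fmt_args
#else
# define sk_zip_fail(fmt_args) /* empty */
#endif

/* Example caller: report and return failure instead of asserting. */
static int sk_check_stream(size_t avail_in, size_t needed)
{
    if (avail_in < needed) {
        sk_zip_fail(("need %lu bytes, have %lu\n",
                     (unsigned long) needed,
                     (unsigned long) avail_in));
        return 0;
    }
    return 1;
}
```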
in page_zip_decompress().
page_zip_decompress_clust(), page_zip_decompress_clust_ext(): Zero-fill
the columns DB_TRX_ID and DB_ROLL_PTR on the uncompressed page.
page_zip_get_trailer_len(), page_zip_write_header(): Correct the
UNIV_MEM_ASSERT_RW() assertions.
page_zip_validate(): Read the validity bits of page, page_zip, and
page_zip->data.
page_zip_decompress(): Assert that the uncompressed page is completely defined.
page_zip_validate(): Assert that the compressed and uncompressed pages are
completely defined. Fetch the "valid" bits, so that they can be examined
when run under valgrind --db-attach=yes.
initialized, although Valgrind believes that some bits in the 7th or 8th
bytes from the end are uninitialized. (They might be, but the decompressor
should not care about those bits after encountering the end-of-stream marker
in the compressed bit stream.)
btr_cur_optimistic_insert(): On compressed tablespaces, check that both
the compressed and the uncompressed page are completely initialized at
the beginning of the function.
page_zip_compress(): After successful compression, check that the compressed
page is completely initialized.
page_zip_write_rec(), page_zip_write_blob_ptr(), page_zip_write_node_ptr(),
page_zip_write_trx_id_and_roll_ptr(), page_zip_clear_rec(),
page_zip_rec_set_deleted(), page_zip_rec_set_owned(), page_zip_dir_insert(),
page_zip_dir_delete(), page_zip_dir_add_slot(), page_zip_reorganize(),
page_zip_copy(), page_zip_get_trailer_len(), page_zip_write_header():
Assert that the complete contents of the compressed page are defined.
page_zip_compress(): Assert that the contents of the uncompressed page
are entirely initialized.
page_zip_decompress(): Assert that the contents of the compressed page
are entirely initialized. Assert that the uncompressed page is entirely
writable. Flag the uncompressed page uninitialized at the beginning.
page_cur_set_before_first(), page_cur_set_after_last(),
page_cur_position(): Add const qualifiers to buf_block_t and rec.
A better solution would be to define a const_page_cur_t and a
set of accessors, but it would lead to severe code duplication.
page_rec_get_n_recs_before(): Add const qualifiers.
page_dir_get_nth_slot(): Define as a const-preserving macro.
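A sketch of what "const-preserving" means here. Because the macro is
plain pointer arithmetic, its result has the same type as its argument:
a const pointer in yields a const pointer out, and a non-const pointer
stays non-const. The layout constants and sk_ names are hypothetical.

```c
#include <assert.h>

enum {
    SK_PAGE_SIZE = 64,  /* toy page size */
    SK_DIR_START = 8,   /* bytes reserved after the directory */
    SK_SLOT_SIZE = 2    /* bytes per directory slot */
};

/* Slots are stored at the end of the page and grow downward. */
#define sk_dir_get_nth_slot(page, n) \
    ((page) + SK_PAGE_SIZE - SK_DIR_START - ((n) + 1) * SK_SLOT_SIZE)

/* Reader: takes a const page, gets a const slot pointer. */
static unsigned sk_slot_get(const unsigned char *page, unsigned n)
{
    const unsigned char *slot = sk_dir_get_nth_slot(page, n);

    return ((unsigned) slot[0] << 8) | slot[1];
}

/* Writer: takes a non-const page, gets a writable slot pointer from the
   very same macro, without any cast. */
static void sk_slot_set(unsigned char *page, unsigned n, unsigned value)
{
    unsigned char *slot = sk_dir_get_nth_slot(page, n);

    assert(value <= 0xffff);

    slot[0] = (unsigned char) (value >> 8);
    slot[1] = (unsigned char) (value & 0xff);
}
```

A function with a const parameter and a non-const return type would need
a cast in one direction or the other; the macro avoids that choice.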
page_dir_slot_get_rec(), page_dir_slot_get_n_owned(),
page_dir_find_owner_slot(), page_check_dir(): Add const qualifiers.
page_rec_get_next_low(): Add const qualifiers.
page_rec_get_next_const(), page_rec_get_prev_const(): New functions,
based on the const-less page_rec_get_next() and page_rec_get_prev().
page_cur_get_page(), page_cur_get_block(), page_cur_get_page_zip(),
page_cur_get_rec(): Define as const-preserving macros.
page_cur_try_search_shortcut(), page_cur_search_with_match():
Add const qualifiers.
buf_page_get_mutex(): Add a const qualifier to buf_page_t*.
rec_get_next_ptr_const(): Const variant of rec_get_next_ptr().
dtuple_validate(): Detect uninitialized data.
page_cur_insert_rec_low(), page_cur_insert_rec_zip(): Assert that the
record being inserted is valid before and after insertion.
For some reason, GCC 4.2.1 ignores casts (for removing constness)
in calls to inline functions.
page_align(), ut_align_down(): Make the parameter const void*, but still
return a non-const pointer. This is ugly, but these functions cannot be
replaced with a const-preserving macro in a portable way, given that
the pointer argument is not always pointing to bytes.
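A sketch of the compromise described above: the parameter is const void*
so that both const and non-const callers can pass any pointer type, while
the result is a plain void*. The name sk_align_down() and the use of
uintptr_t are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Round a pointer down to a multiple of align_no, which must be a power
   of two.  Note that the constness of the argument is dropped. */
static void *sk_align_down(const void *ptr, uintptr_t align_no)
{
    assert(align_no > 0);
    assert((align_no & (align_no - 1)) == 0);

    return (void *) ((uintptr_t) ptr & ~(align_no - 1));
}
```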
buf_block_get_page_zip(): Implement as a const-preserving macro.
buf_frame_get_page_zip(), buf_block_align(): Add const qualifiers.
lock_rec_get_prev(): Silence GCC 4.2.1 warnings.
mlog_write_initial_log_record(), mlog_write_initial_log_record_fast(),
mtr_memo_contains(): Add const qualifier to the pointer.
page_header_get_ptr(): Rewrite as page_header_get_offs(), and
implement as a macro that calls this function.
offsets_[] arrays, as suggested by Vasil.
rec_offs_set_n_alloc(): Declare as a public function. Assert that
n_alloc > REC_OFFS_HEADER_SIZE.
rec_offs_get_n_alloc(): Assert that n_alloc > REC_OFFS_HEADER_SIZE.
mem_heap_zalloc() and mem_zalloc(), because calloc() in the C runtime
library takes two size parameters, not one.
mem_heap_zalloc(): Add debug assertions. Document that the return value
is never NULL.
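A sketch of a zero-filling allocator with a single size parameter, unlike
calloc(nmemb, size). The name sk_zalloc() is hypothetical, and aborting
on allocation failure is merely one way to make the "never NULL" promise
hold in this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate n bytes and fill them with zeros.  Never returns NULL. */
static void *sk_zalloc(size_t n)
{
    void *p = malloc(n ? n : 1);  /* avoid malloc(0) ambiguity */

    if (p == NULL) {
        abort();
    }
    return memset(p, 0, n);
}
```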
Some things still fail in innodb-index.test, and there seems to be
a race condition (data dictionary lock wait) when running with --valgrind.
dfield_t: Add an "external storage" flag, dfield->ext.
dfield_is_null(), dfield_is_ext(), dfield_set_ext(), dfield_set_null():
New functions.
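A sketch of a field descriptor that carries its own "external storage"
flag, with the four accessors. The struct layout, the SK_SQL_NULL
sentinel and the choice to clear the flag when setting NULL are
assumptions of this sketch, not the actual dfield_t definition.

```c
#include <assert.h>
#include <stddef.h>

#define SK_SQL_NULL ((size_t) -1)  /* hypothetical "SQL NULL" length */

struct sk_dfield {
    const void *data;  /* pointer to the column value */
    size_t len;        /* length of data, or SK_SQL_NULL */
    unsigned ext : 1;  /* set if part of the value is stored externally */
};

static int sk_dfield_is_null(const struct sk_dfield *field)
{
    assert(field != NULL);
    return field->len == SK_SQL_NULL;
}

static int sk_dfield_is_ext(const struct sk_dfield *field)
{
    assert(field != NULL);
    return field->ext;
}

static void sk_dfield_set_ext(struct sk_dfield *field)
{
    assert(field != NULL);
    field->ext = 1;
}

static void sk_dfield_set_null(struct sk_dfield *field)
{
    assert(field != NULL);
    field->data = NULL;
    field->len = SK_SQL_NULL;
    field->ext = 0;  /* assumption: a NULL value has no external part */
}
```

Keeping the flag in the field itself removes the need for a separate
sorted vector of externally stored field numbers.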
dfield_copy(), dfield_copy_data(): Add const qualifiers, fix in/out comments.
data_write_sql_null(): Use memset().
big_rec_field_t: Replace byte* data with const void* data.
ut_ulint_sort(): Remove.
upd_field_t: Remove extern_storage.
upd_node_t: Replace ext_vec, n_ext_vec with n_ext.
row_merge_copy_blobs(): New function.
row_ins_index_entry(): Add the parameter "ibool foreign" for suppressing
foreign key checks during fast index creation or when inserting into
secondary indexes.
btr_page_insert_fits(): Add const qualifiers.
btr_cur_add_ext(), upd_ext_vec_contains(): Remove.
dfield_print_also_hex(), dfield_print(): Replace if...else if with switch.
Observe dfield_is_ext().