column of a compressed table, the BTR_EXTERN_LEN field in the BLOB pointer
will be written as 0. Tolerate this in the functions that deal with
externally stored columns. This fixes Issue #80 and was posted at rb://26.
Note that the clustered index record is always deleted or purged last,
after any secondary index records referring to it have been deleted.
btr_free_externally_stored_field(): On an uncompressed table, zero out
the BTR_EXTERN_LEN, so that half-deleted BLOBs can be detected after
crash recovery.
btr_copy_externally_stored_field_prefix(): Return 0 if the BLOB has been
half-deleted.
row_upd_ext_fetch(): Assert that the externally stored column exists.
row_ext_cache_fill(): Allow btr_copy_externally_stored_field_prefix()
to return 0.
row_sel_sec_rec_is_for_blob(): Return FALSE if the BLOB has been half-deleted.
This is correct, because the clustered index record would have been deleted
or purged last, after any secondary index records referring to it had been
deleted.
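The half-deleted BLOB check described above can be sketched as follows.
This is a minimal illustration, not the InnoDB code: it assumes a 20-byte
BLOB pointer whose last 8 bytes hold a big-endian length, and the names
REF_SIZE, REF_LEN_OFFSET, blob_ref_get_len() and blob_copy_prefix() are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of the 20-byte BLOB pointer: space id (4 bytes),
page number (4), offset (4), length (8, big-endian). */
#define REF_SIZE	20
#define REF_LEN_OFFSET	12

/* Read the low 32 bits of the length field (hypothetical helper). */
uint32_t
blob_ref_get_len(const unsigned char* ref)
{
	const unsigned char*	p = ref + REF_LEN_OFFSET + 4;

	return(((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
	       | ((uint32_t) p[2] << 8) | (uint32_t) p[3]);
}

/* Copy at most buf_size bytes of the column to buf. Return the number
of bytes copied, or 0 if the length field is 0 (half-deleted BLOB). */
size_t
blob_copy_prefix(unsigned char* buf, size_t buf_size,
		 const unsigned char* ref, const unsigned char* blob)
{
	size_t	len;

	assert(ref != NULL);
	len = blob_ref_get_len(ref);

	if (len == 0) {
		/* The length was zeroed out when the BLOB was freed. */
		return(0);
	}

	if (len > buf_size) {
		len = buf_size;
	}

	memcpy(buf, blob, len);
	return(len);
}
```

A caller such as a secondary index comparison would treat a return value
of 0 as "no match" instead of failing an assertion.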
recovery, tolerate clustered index records whose externally stored
columns have not been written. This should remove the assertion failures
that were reported as Mantis issues #58, #62, and #64.
trx_is_recv(): New function: TRUE if this transaction is rolling back
an incomplete transaction in crash recovery.
enum trx_rbmode: Rollback modes: no rollback, normal rollback, crash recovery.
btr_cur_pessimistic_delete(), btr_free_externally_stored_field(),
btr_rec_free_externally_stored_fields():
Replace the ibool parameter with enum trx_rbmode.
btr_free_externally_stored_field(): If field_ref is zero, assert
ut_a(rbmode == RB_RECOVERY) and return. Unless InnoDB has crashed
while inserting a clustered index record, field_ref should not be zero.
btr_rec_free_updated_extern_fields(): Add the parameter enum trx_rbmode.
btr_cur_pessimistic_update(): Pass the rbmode parameter to
btr_rec_free_updated_extern_fields().
row_undo_ins(), row_undo_mod_upd_del_sec(): If row_build_index_entry()
fails, assert trx_is_recv() and skip this secondary index.
row_undo_mod_upd_del_sec(): Empty the heap at the end of each loop
iteration in order to conserve memory and to reduce the number of
low-level memory allocations.
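A rough sketch of how a rollback mode can replace the ibool parameter.
The enumerator names RB_NONE and RB_NORMAL, the helper blob_needs_free()
and the 20-byte reference size are assumptions of this sketch; only
RB_RECOVERY and field_ref_zero are named in the log.

```c
#include <assert.h>
#include <string.h>

/* Rollback modes; RB_NONE and RB_NORMAL are assumed names. */
enum trx_rbmode {
	RB_NONE,	/* no rollback */
	RB_NORMAL,	/* normal rollback */
	RB_RECOVERY	/* rollback of an incomplete transaction
			in crash recovery */
};

#define FIELD_REF_SIZE	20	/* assumed size of a BLOB pointer */

static const unsigned char field_ref_zero[FIELD_REF_SIZE] = {0};

/* Hypothetical helper: return 1 if the BLOB pages referenced by
field_ref should be freed, or 0 if the BLOB was never written. */
int
blob_needs_free(const unsigned char* field_ref, enum trx_rbmode rbmode)
{
	if (memcmp(field_ref, field_ref_zero, FIELD_REF_SIZE) == 0) {
		/* An all-zero pointer is only expected after a crash
		during the insert of the clustered index record. */
		assert(rbmode == RB_RECOVERY);
		return(0);
	}

	return(1);
}
```

Compared with a boolean, the three-valued mode lets the low-level
function distinguish crash recovery from an ordinary rollback.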
in *.h files, so that the function signatures in the *.h and *.c files fully
match each other.
ut_dulint_sort(): Add a UNIV_INTERN qualifier also to the function definition.
to the undo log, also store the original length of the column, so that the
changes will be correctly undone in transaction rollback or when fetching
previous versions of the row.
innodb-zip.test: New file, for tests of the compression.
upd_field_t: Add orig_len, the original length of new_val.
btr_push_update_extern_fields(): Restore the original prefix of the column.
Add the parameter heap where memory will be allocated if necessary.
trx_undo_rec_get_col_val(): Add the output parameter orig_len.
trx_undo_page_report_modify_ext(): New function: Write an externally
stored column to the undo log. It is called only from
trx_undo_page_report_modify(), and it is the only caller of
trx_undo_page_fetch_ext().
trx_undo_update_rec_get_update(): Read the original length of the column
prefix to upd_field->orig_len.
stored columns (BLOBs).
btr_copy_blob_prefix(), btr_copy_zblob_prefix(),
btr_copy_externally_stored_field_prefix_low(),
btr_copy_externally_stored_field_prefix(),
btr_copy_externally_stored_field(),
btr_rec_copy_externally_stored_field():
Note that the page containing the clustered index record that points to
the BLOB must be latched.
btr_copy_zblob_prefix(): Note that there is no latch on the page, and thus
all accesses to a given page via this function must be covered by the same
set of locks or latches.
btr_copy_zblob_prefix(): Note that the block acquired by
buf_page_get_zip() is protected by an exclusive table lock
or by a latch on the clustered index record.
Only add indexed BLOBs to row_ext.
trx_undo_rec_get_partial_row(): Move the BLOB fetching to row_ext_create().
row_build(): Pass only those BLOBs to row_ext_create() that are referenced by
ordering columns of some indexes, similar to trx_undo_rec_get_partial_row().
row_ext_create(): Add the parameter "tuple". Move the implementation
from row0ext.ic to row0ext.c.
row_ext_lookup_ith(), row_ext_lookup(): Return a const pointer. Remove
the parameters "field" and "f_len". Make the row_ext_t* parameter const.
row_ext_t: Remove the field zip_size.
field_ref_zero[]: Declare in btr0types.h instead of btr0cur.h.
row_ext_lookup_low(): Rename to row_ext_cache_fill() and change the
signature.
univ.i: Do not define UNIV_DEBUG, UNIV_ZIP_DEBUG.
btr_cur_del_unmark_for_ibuf(): Use the same comment in both btr0cur.c and
btr0cur.h. Wrap long lines.
contents end up with conflicting versions of a record's state. The zipped
page record was not being marked as "(un)deleted" because we were not
passing the zipped page contents to the (un)delete function. That function
first (un)delete-marks the uncompressed version of the record and then,
if page_zip is not NULL, (un)delete-marks the record in the compressed page.
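The two-step marking can be illustrated with the following sketch. The
struct layouts, the flag value 0x20 and the function name
rec_set_deleted_flag_sketch() are assumptions made for illustration and
do not reproduce the InnoDB page format.

```c
#include <assert.h>
#include <stddef.h>

#define DELETED_FLAG	0x20	/* assumed bit value */

/* Hypothetical stand-ins for a record and its compressed-page copy. */
struct rec {
	unsigned char	info_bits;
};

struct page_zip_rec {
	unsigned char	deleted;
};

/* Set or clear the delete mark in the uncompressed record and, when a
compressed page is passed, also in the compressed copy. */
void
rec_set_deleted_flag_sketch(struct rec* rec, struct page_zip_rec* page_zip,
			    int flag)
{
	assert(rec != NULL);

	if (flag) {
		rec->info_bits |= DELETED_FLAG;
	} else {
		rec->info_bits &= (unsigned char) ~DELETED_FLAG;
	}

	if (page_zip != NULL) {
		page_zip->deleted = flag ? 1 : 0;
	}
}
```

Passing NULL for page_zip on a compressed table reproduces the bug:
only the uncompressed copy changes, and the two copies disagree.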
inserted, uncommitted clustered index records when determining if a
secondary index record that contains a column prefix of an externally
stored column is referencing the clustered index record.
field_ref_zero[]: A BLOB pointer full of zeros, for use in comparisons.
btr_copy_externally_stored_field_prefix(): Assert that the BLOB pointer is set.
row_ext_lookup_ith(), row_ext_lookup(), row_ext_lookup_low(): Document
that field_ref_zero is returned when the BLOB cannot be fetched.
row_ext_lookup_low(): Return field_ref_zero and *len = 0 when the
BLOB pointer is unset.
row_build_index_entry(): Return NULL when a needed BLOB pointer cannot
be dereferenced (row_ext_lookup returns field_ref_zero). Check the
return value for NULL in callers.
row_vers_impl_x_locked_off_kernel(): Avoid comparisons when
row_build_index_entry() returns NULL.
row_vers_old_has_index_entry(): Ignore records for which
row_build_index_entry() returns NULL. The entry should never be NULL
in rollback, but it may be NULL in purge.
row_merge_buf_add(): Assert that row_ext_lookup() does not return
field_ref_zero. The table will be locked during index creation.
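The sentinel convention above (the lookup returns field_ref_zero, the
entry builder returns NULL, and callers check for NULL) might look
roughly like this. The types ext_cache and index_entry and the functions
ext_lookup() and build_index_entry() are simplified, hypothetical
stand-ins for row_ext_t, dtuple_t, row_ext_lookup() and
row_build_index_entry().

```c
#include <assert.h>
#include <stddef.h>

/* An all-zero BLOB pointer, used only as a sentinel address here. */
const unsigned char field_ref_zero[20] = {0};

/* Hypothetical cache of one fetched column prefix. */
struct ext_cache {
	const unsigned char*	prefix;	/* NULL if the BLOB pointer
					was unset */
	size_t			len;
};

/* Hypothetical index entry with a single field. */
struct index_entry {
	const unsigned char*	data;
	size_t			len;
};

/* Return the cached prefix, or field_ref_zero and *len = 0 when the
BLOB could not be fetched. */
const unsigned char*
ext_lookup(const struct ext_cache* ext, size_t* len)
{
	if (ext->prefix == NULL) {
		*len = 0;
		return(field_ref_zero);
	}

	*len = ext->len;
	return(ext->prefix);
}

/* Fill in the entry, or return NULL when a needed BLOB is missing. */
struct index_entry*
build_index_entry(struct index_entry* entry, const struct ext_cache* ext)
{
	size_t			len;
	const unsigned char*	data = ext_lookup(ext, &len);

	assert(entry != NULL);

	if (data == field_ref_zero) {
		return(NULL);
	}

	entry->data = data;
	entry->len = len;
	return(entry);
}
```

Comparing against the address of field_ref_zero keeps the check cheap
and avoids confusing a genuinely empty prefix with a missing BLOB.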
fix the bugs introduced in r1591.
row_rec_to_index_entry_low(): Clear "n_ext". Do not allow it to be NULL.
Add const qualifier to dict_index_t*.
row_rec_to_index_entry(): Add the parameters "offsets" and "n_ext".
btr_cur_optimistic_update(): Add an assertion that there are no externally
stored columns. Remove the unreachable call to btr_cur_unmark_extern_fields()
and the preceding unnecessary call to rec_get_offsets().
btr_push_update_extern_fields(): Remove the parameters index, offsets.
Only report the additional externally stored columns of the update vector.
row_build(), trx_undo_rec_get_partial_row(): Flag externally stored columns
also with dfield_set_ext().
rec_copy_prefix_to_dtuple(): Assert that there are no externally stored
columns in the prefix.
row_build_row_ref(): Note and assert that the index is a secondary index,
and assert that there are no externally stored columns.
row_build_row_ref_fast(): Assert that there are no externally stored columns.
rec_offs_get_n_alloc(): Expose the function.
row_build_row_ref_in_tuple(): Assert that there are no externally stored
columns in a record of a secondary index.
row_build_row_ref_from_row(): Assert that there are no externally stored
columns.
row_upd_check_references_constraints(): Add the parameter offsets, to
avoid a redundant call to rec_get_offsets().
row_upd_del_mark_clust_rec(): Add the parameter offsets. Remove
duplicated code.
row_ins_index_entry_set_vals(): Copy the external storage flag.
sel_pop_prefetched_row(): Assert that there are no externally stored
columns.
row_scan_and_check_index(): Copy offsets to a temporary heap across
the invocation of row_rec_to_index_entry().
and use it for flagging externally stored columns in the data tuple.
The data tuple contains the same columns as the clustered index record,
but in a different order. This error was introduced in r1591.
TODO: the assertion ut_ad(!dfield_is_ext()) may fail in
btr_cur_pessimistic_update().
Some things still fail in innodb-index.test, and there seems to be
a race condition (data dictionary lock wait) when running with --valgrind.
dfield_t: Add an "external storage" flag, dfield->ext.
dfield_is_null(), dfield_is_ext(), dfield_set_ext(), dfield_set_null():
New functions.
dfield_copy(), dfield_copy_data(): Add const qualifiers, fix in/out comments.
data_write_sql_null(): Use memset().
big_rec_field_t: Replace byte* data with const void* data.
ut_ulint_sort(): Remove.
upd_field_t: Remove extern_storage.
upd_node_t: Replace ext_vec, n_ext_vec with n_ext.
row_merge_copy_blobs(): New function.
row_ins_index_entry(): Add the parameter "ibool foreign" for suppressing
foreign key checks during fast index creation or when inserting into
secondary indexes.
btr_page_insert_fits(): Add const qualifiers.
btr_cur_add_ext(), upd_ext_vec_contains(): Remove.
dfield_print_also_hex(), dfield_print(): Replace if...else if with switch.
Observe dfield_is_ext().
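A minimal sketch of a data field that carries its own "external
storage" flag, together with the four accessors named above. The struct
layout, the type name dfield_sketch_t and the SQL NULL sentinel value
are assumptions; the real dfield_t has further members.

```c
#include <assert.h>
#include <stddef.h>

#define SQL_NULL_LEN	0xFFFFFFFFU	/* assumed sentinel length */

/* Simplified stand-in for dfield_t. */
typedef struct {
	const void*	data;	/* pointer to the column value */
	unsigned	ext;	/* nonzero if externally stored */
	unsigned	len;	/* length in bytes, or SQL_NULL_LEN */
} dfield_sketch_t;

int
dfield_is_null(const dfield_sketch_t* field)
{
	return(field->len == SQL_NULL_LEN);
}

int
dfield_is_ext(const dfield_sketch_t* field)
{
	return(field->ext != 0);
}

void
dfield_set_ext(dfield_sketch_t* field)
{
	assert(!dfield_is_null(field));
	field->ext = 1;
}

void
dfield_set_null(dfield_sketch_t* field)
{
	/* A NULL column cannot be stored externally. */
	field->data = NULL;
	field->ext = 0;
	field->len = SQL_NULL_LEN;
}
```

Keeping the flag in the field itself is what makes separate ext[] arrays
such as upd_node_t::ext_vec unnecessary.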
to rec_t*. Remove the ut_ad(rec_offs_validate()), because this function
will be called from row0merge.c on a record that lacks the
REC_N_NEW_EXTRA_BYTES.
exactly. Rename the local variable "ulint level" to "ibool leaf".
Document that if the function returns DB_SUCCESS on a compressed page that
is covered by the insert buffer, the mini-transaction must be committed
before latching any further pages. Verify that this is the case on all
execution paths.
the insert buffer bitmap.
ibuf_set_free_bits_func(): Never disable redo logging.
ibuf_update_free_bits_zip(): Remove.
btr_page_reorganize_low(), page_zip_reorganize(): Do not update the insert
buffer bitmap. Instead, document that callers will have to take care of it,
and adapt the callers.
btr_compress(): On error, reset the insert buffer free bits.
btr_cur_insert_if_possible(): Do not modify the insert buffer bitmap.
btr_compress(), btr_cur_optimistic_insert(): On compressed pages,
reset the insert buffer bitmap. Document why.
btr_cur_update_alloc_zip(): Document why it is necessary and sufficient
to reset the insert buffer free bits.
btr_cur_update_in_place(), btr_cur_optimistic_update(),
btr_cur_pessimistic_update(): Update the free bits in the same
mini-transaction. Document that the mini-transaction must be
committed before latching any further pages. Verify that this
is the case in all execution paths.
row_ins_sec_index_entry_by_modify(), row_ins_clust_index_entry_by_modify(),
row_undo_mod_clust_low(): Because these functions call
btr_cur_update_in_place(), btr_cur_optimistic_update(), or
btr_cur_pessimistic_update(), document that the mini-transaction must be
committed before latching any further pages. Verify that this is the case
in all execution paths.
Previously, when big_rec was returned, the fields would point to
freed memory. The memory heap was allocated locally, and the data tuple
was allocated from the heap, and the big_rec would point to some fields
in the data tuple.
row_ins_clust_index_entry_by_modify(): Add parameter heap,
for the same reason.
btr_check_node_ptr(): Replace page_t* parameter with buf_block_t*.
btr_free_externally_stored_field(): Add const qualifier to rec.
Remove an explicit buf_block_align() call, but replace an
mtr_memo_contains() with mtr_memo_contains_page().
row_upd_rec_sys_fields(): Reorder an assertion containing buf_block_align()
so that the costly call can be avoided in some cases.
from the adaptive hash index [btr_search_guess_on_hash() and
btr_search_validate()]. Some references to buf_block_align() remain
in debug builds.
btr_store_big_rec_extern_fields(): Add the parameter rec_block.
page_rec_get_next_low(): Do not assume that the page has been
allocated from the buffer pool when printing the diagnostic information.
page_cur_insert_rec_low(): Replace the parameter page_zip_des_t* page_zip
with the parameter buf_block_t* block.
btr_cur_t: Move page_block to page_cur_t::block.
page_cur_get_block(), page_cur_get_page_zip(): New functions.
page_cur_position(): Add parameter block.
Remove many page_zip parameters, now that there is page_cur_get_page_zip().
Replace some page, page_zip parameters with block.
Add some const qualifiers to function parameters and remove casts.
PAGE_HEAP_NO_INFIMUM, PAGE_HEAP_NO_SUPREMUM, PAGE_HEAP_NO_USER_LOW:
New constants.
Replace some cursor code in low-level diagnostic functions with
direct management of rec, because buf_block_t::buf_fix_count may be 0
when the functions are called, and debug assertions would fail.
of externally stored columns, and fix bugs introduced in r873. (Bug #22496)
btr_page_get_sure_split_rec(), btr_page_insert_fits(),
rec_get_converted_size(), rec_convert_dtuple_to_rec(),
rec_convert_dtuple_to_rec_old(), rec_convert_dtuple_to_rec_new():
Add parameters ext and n_ext. Flag external fields during the
conversion.
rec_set_field_extern_bits(), rec_set_field_extern_bits_new(),
rec_offs_set_nth_extern(), rec_set_nth_field_extern_bit_old():
Remove. The bits are set by rec_convert_dtuple_to_rec().
page_cur_insert_rec_low(): Remove the parameters ext and n_ext.
btr_cur_add_ext(): New utility function for updating and sorting ext[].
Low-level functions now expect the array to be in ascending order
for performance reasons. Used in btr_cur_optimistic_insert(),
btr_cur_pessimistic_insert(), and btr_cur_pessimistic_update().
btr_cur_optimistic_insert(): Remove some defensive code, because we cannot
compute the added parameters of rec_get_converted_size().
btr_push_update_extern_fields(): Sort the array. Require the array to
be twice the maximum usage, so that ut_ulint_sort() can be used.
dtuple_convert_big_rec(): Allocate new space for the BLOB pointer,
to avoid overwriting prefix indexes on the same column. Adapt
dtuple_convert_back_big_rec().
row_build_index_entry(): Fetch the columns also for prefix indexes of
the clustered index.
page_zip_apply_log(), page_zip_decompress_clust(): Allow externally
stored fields to lack a locally stored part.
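The requirement that the ext[] array be twice its maximum usage follows
from sorting with a merge sort that needs scratch space. The sketch
below is an assumption about how such a sort works, not a copy of
ut_ulint_sort(); the name sort_ext() is hypothetical. The upper half of
the array serves as the auxiliary buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sort arr[low..high) in ascending order, using aux[low..high) as
scratch space. aux must not overlap arr[low..high). */
void
sort_ext(unsigned long* arr, unsigned long* aux, size_t low, size_t high)
{
	size_t	mid;
	size_t	i;
	size_t	j;
	size_t	k;

	assert(low <= high);

	if (high - low < 2) {
		return;
	}

	mid = low + (high - low) / 2;
	sort_ext(arr, aux, low, mid);
	sort_ext(arr, aux, mid, high);

	/* Merge the two sorted halves into aux. */
	i = low;
	j = mid;
	k = low;

	while (i < mid && j < high) {
		aux[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
	}
	while (i < mid) {
		aux[k++] = arr[i++];
	}
	while (j < high) {
		aux[k++] = arr[j++];
	}

	memcpy(arr + low, aux + low, (high - low) * sizeof *arr);
}
```

With an array of 2 * n elements holding n field numbers, a call of the
form sort_ext(ext, ext + n, 0, n) sorts in place without a separate
allocation.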
stored column. This is the first part of fixing Bug #22496.
btr_copy_externally_stored_field_prefix(): New function.
btr_copy_externally_stored_field(): Split to
btr_copy_externally_stored_field_prefix_low().
row_sel_sec_rec_is_for_blob(): New function, used by
row_sel_sec_rec_is_for_clust_rec() in selects via
a secondary index.
btr_rec_copy_externally_stored_field(): Add parameter zip_size.
Do not call buf_block_align(rec), because rec can also be in
dynamically allocated memory. buf_block_align() can only be invoked
on addresses inside the buffer pool.
page_zip_clear_rec(): Improve formatting.
Fix the way btr_free_externally_stored_field() is called in purge.
btr_free_externally_stored_field(): Add parameter field_ref that points
directly to the BLOB reference. Use rec, offsets, page_zip, and i
only for the page_zip_write_blob_ptr() call.
row_purge_upd_exist_or_extern(): Do not assume that the undo log contains
the entire record. Only pass the BLOB reference to
btr_free_externally_stored_field().
This has not been extensively tested yet, because some other part of the
code breaks in "ibtestblob".
btr_free_page_low(): Add parameters "space" and "page_no", because they
are omitted from compressed BLOB pages.
btr0cur.c: Implement the compression and decompression of BLOB columns,
enabled at compile-time (#define ZIP_BLOB TRUE) for now.
btr_rec_free_externally_stored_fields(),
btr_copy_externally_stored_field(): Make static.
mlog_log_string(): New function, split from mlog_write_string(), which
avoids a dummy memcpy() of compressed BLOB pages.
flag of records. The flags may only be updated in heap-allocated
copies of records.
btr_root_raise_and_insert(),
btr_page_split_and_insert(),
btr_cur_insert_if_possible(),
btr_cur_optimistic_insert(),
btr_cur_pessimistic_insert(),
page_cur_tuple_insert(),
page_cur_insert_rec_low(): Add parameters "ext" and "n_ext".
dtuple_convert_big_rec(): Make parameter "ext" const.
BLOB pointers, trx_id, and roll_ptr.
btr_empty(), btr_create(), page_create(): Add parameter "index", as some
index information will be encoded on the compressed page.
Define REC_NODE_PTR_SIZE as 4.
Allow btr_page_reorganize() and btr_page_reorganize_low() to fail.
Define the error code DB_ZIP_OVERFLOW.
Make row_ins_index_entry_low() static.
page0zip: Encode the index, log reorganized records, and store uncompressed
fields separately from the compressed data stream.