of externally stored columns, and fix bugs introduced in r873. (Bug #22496)
btr_page_get_sure_split_rec(), btr_page_insert_fits(),
rec_get_converted_size(), rec_convert_dtuple_to_rec(),
rec_convert_dtuple_to_rec_old(), rec_convert_dtuple_to_rec_new():
Add parameters ext and n_ext. Flag external fields during the
conversion.
rec_set_field_extern_bits(), rec_set_field_extern_bits_new(),
rec_offs_set_nth_extern(), rec_set_nth_field_extern_bit_old():
Remove. The bits are set by rec_convert_dtuple_to_rec().
page_cur_insert_rec_low(): Remove the parameters ext and n_ext.
btr_cur_add_ext(): New utility function for updating and sorting ext[].
Low-level functions now expect the array to be in ascending order
for performance reasons. Used in btr_cur_optimistic_insert(),
btr_cur_pessimistic_insert(), and btr_cur_pessimistic_update().
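The sorted-array requirement above can be illustrated with a minimal sketch. This is not the actual btr_cur_add_ext(); the name ext_add_sorted() and its signature are hypothetical, and the sketch assumes that ext[] holds field numbers and that duplicates are not wanted.

```c
#include <assert.h>
#include <string.h>

typedef unsigned long ulint;

/* Hypothetical helper (not the real btr_cur_add_ext()): add field
number "no" to the ascending array ext[0..n_ext-1], keeping the array
sorted and free of duplicates. The caller must provide room for
n_ext + 1 elements. Returns the new number of elements. */
ulint
ext_add_sorted(ulint* ext, ulint n_ext, ulint no)
{
	ulint	i = n_ext;

	/* Scan from the end for the first element that is not
	greater than "no". */
	while (i > 0 && ext[i - 1] > no) {
		i--;
	}

	if (i > 0 && ext[i - 1] == no) {
		return(n_ext);	/* already present */
	}

	/* Shift the larger elements up by one and insert. */
	memmove(ext + i + 1, ext + i, (n_ext - i) * sizeof *ext);
	ext[i] = no;

	return(n_ext + 1);
}
```

Keeping the array ordered at insertion time lets low-level consumers walk ext[] in step with the record fields instead of searching it for every field.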
btr_cur_optimistic_insert(): Remove some defensive code, because we cannot
compute the added parameters of rec_get_converted_size().
btr_push_update_extern_fields(): Sort the array. Require the array to
be twice the maximum usage, so that ut_ulint_sort() can be used.
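Why the array must be twice the maximum usage: a merge sort needs scratch space as large as the data. The sketch below is a stand-in written for illustration, not the real ut_ulint_sort(); the names sk_merge_sort() and sk_sort_in_double_buffer() are hypothetical, and the sketch assumes the upper half of the buffer serves as the auxiliary array.

```c
#include <string.h>

typedef unsigned long ulint;

/* Merge sort arr[low..high-1] in ascending order, using
aux[low..high-1] as scratch space. */
void
sk_merge_sort(ulint* arr, ulint* aux, ulint low, ulint high)
{
	ulint	mid;
	ulint	i;
	ulint	j;
	ulint	k;

	if (high - low < 2) {
		return;
	}

	mid = low + (high - low) / 2;
	sk_merge_sort(arr, aux, low, mid);
	sk_merge_sort(arr, aux, mid, high);

	/* Merge the two sorted halves into aux[]. */
	i = low;
	j = mid;
	k = low;
	while (i < mid && j < high) {
		aux[k++] = (arr[j] < arr[i]) ? arr[j++] : arr[i++];
	}
	while (i < mid) {
		aux[k++] = arr[i++];
	}
	while (j < high) {
		aux[k++] = arr[j++];
	}

	memcpy(arr + low, aux + low, (high - low) * sizeof *arr);
}

/* buf has room for 2 * n elements: the first n hold the data and
the last n are used as scratch space by the sort. */
void
sk_sort_in_double_buffer(ulint* buf, ulint n)
{
	sk_merge_sort(buf, buf + n, 0, n);
}
```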
dtuple_convert_big_rec(): Allocate new space for the BLOB pointer,
to avoid overwriting prefix indexes to the same column. Adapt
dtuple_convert_back_big_rec().
row_build_index_entry(): Fetch the columns also for prefix indexes of
the clustered index.
page_zip_apply_log(), page_zip_decompress_clust(): Allow externally
stored fields to lack a locally stored part.
with FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID and FIL_PAGE_DATA. The doublewrite
buffer needs to read the space_id in order to determine the type of the page.
Because FIL_PAGE_TYPE could contain garbage in MySQL/InnoDB 5.0 and earlier
versions, we cannot trust fil_page_get_type(). Instead, we have to always
store the space_id at the same location. This modification wastes 12 bytes
per compressed BLOB page (1.2% on 1-kilobyte pages).
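The quoted overhead follows from 12 / 1024, which is about 1.17% and rounds to 1.2%. A small sketch of the arithmetic, with the hypothetical helper sk_overhead_centipercent() returning hundredths of a percent:

```c
typedef unsigned long ulint;

/* Overhead of "wasted" bytes on a page of "page_size" bytes,
in hundredths of a percent, rounded down. */
ulint
sk_overhead_centipercent(ulint wasted, ulint page_size)
{
	return(wasted * 10000 / page_size);
}
```

The relative cost shrinks as the compressed page size grows; on a 16-kilobyte page the same 12 bytes amount to well under 0.1%.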
stored column. This is the first part of fixing Bug #22496.
btr_copy_externally_stored_field_prefix(): New function.
btr_copy_externally_stored_field(): Split off
btr_copy_externally_stored_field_prefix_low().
row_sel_sec_rec_is_for_blob(): New function, used by
row_sel_sec_rec_is_for_clust_rec() in selects via
a secondary index.
btr_push_update_extern_fields(): Instead of iterating over all
upd_get_n_fields() fields, stop at the first match.
row_search_index_entry(): Simplify the return statements.
row_upd_sec_step(): Eliminate the local variable "err".
row_upd_clust_step(): Add a UNIV_UNLIKELY hint.
guaranteed free space available for inserting one record.
btr_page_get_sure_split_rec(), btr_cur_pessimistic_insert():
Use page_zip_empty_size().
btr_page_split_and_insert(): Relax a debug assertion that there should
be at least two user records on the page. On compressed pages, we may
be able to write only one record.
record will fit or need external storage.
btr_page_get_sure_split_rec(): Estimate the free space of an empty
compressed page.
page_zip_rec_needs_ext(): New function, to replace existing tests whether
external storage is needed.
btr_rec_copy_externally_stored_field(): Add parameter zip_size.
Do not call buf_block_align(rec), because rec can also be in
dynamically allocated memory. buf_block_align() can only be invoked
on addresses inside the buffer pool.
page_zip_clear_rec(): Improve formatting.
btr_free_externally_stored_field(): Replace mlog_write_ulint()
with mach_write_to_4() when page_zip != NULL. The operation is
logged by page_zip_write_blob_ptr().
page_cur_delete_rec(): Do not call page_zip_validate() at the beginning,
because btr_set_min_rec_mark() in btr_cur_pessimistic_delete() will
cause a temporary mismatch.
Document temporary mismatches caused by btr_set_min_rec_mark() calls
and explain why they will not cause any problems.
os_aio_simulated_handle(): Temporarily disable os_file_check_page_trailers(),
which cannot be invoked on compressed pages.
dict_table_add_system_columns(): New function, split from
dict_table_add_to_cache().
mlog_parse_index(): Add system columns to the dummy table and identify
DB_TRX_ID and DB_ROLL_PTR in the dummy index.
buf_LRU_get_free_block(): Note that page_zip->data should be allocated from
an aligned memory pool.
buf_flush_buffered_writes(): Write compressed pages to disk.
buf_flush_post_to_doublewrite_buf(): Copy compressed pages to the
doublewrite buffer. Zero fill any excess space.
buf_flush_init_for_writing(): Treat all compressed pages the same.
buf_read_page_low(): Read compressed pages from disk.
buf_page_io_complete(): Process compressed pages.
trx_sys_doublewrite_init_or_restore_page(): Process compressed pages.
mlog_write_initial_log_record_fast(): Enable a debug printout
#ifdef UNIV_LOG_DEBUG.
fsp_header_init(), fsp_fill_free_list(): Pass the compressed page size
to buf_page_create().
page_zip_compress_write_log(): Flatten the if-else if-else logic.
page_zip_parse_write_blob_ptr(): Do not test page_zip if page==NULL.
page_zip_parse_write_node_ptr(): Do not test page_zip if page==NULL.
Invoke mlog_close() correctly.
row_sel_store_row_id_to_prebuilt(): Add UNIV_UNLIKELY hint to an
assertion-like test.
btr_cur_compress_if_useful(): Replace "if (...) return(...); return(...);"
with a single return statement.
page_rec_get_next_low(): New function.
page_rec_get_prev(): Invoke page_is_comp() outside the loop.
Replace page_rec_get_next() with loop-specific instances of
page_rec_get_next_low().
page_copy_rec_list_end(): Add some debug assertions.
Introduce FIL_PAGE_ZBLOB_DATA as a synonym for FIL_PAGE_FILE_FLUSH_LSN.
btr_store_big_rec_extern_fields(): Make the assertion about
dict_table_zip_size() more accurate.
buf_LRU_get_free_block(), buf_block_alloc(): Add parameter zip_size.
buf_calc_zblob_page_checksum(): Remove. Replace with page_zip_calc_checksum().
buf_page_init(): Remove parameter zip_size.
buf_page_io_complete(): Add a placeholder for handling compressed pages.
trx_doublewrite_page_inside(): Remove redundant function.
page_zip_write_rec(): Relax an overly tight assertion about blob_no.
btr_page_reorganize_low(): Rename new_page to temp_page.
btr_store_big_rec_extern_fields(): FIL_PAGE_TYPE is 2 bytes, not 4.
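The effect of the field-width mistake can be sketched as follows. The writers sk_write_to_2() and sk_write_to_4() are hypothetical stand-ins for big-endian 2-byte and 4-byte writes, and the offset 24 for the page type field is an assumption of this sketch.

```c
#include <assert.h>

typedef unsigned long ulint;
typedef unsigned char byte;

/* Assumed byte offset of the 2-byte page type field. */
#define SK_FIL_PAGE_TYPE	24

/* Write the 16 least significant bits of n, most significant
byte first. */
void
sk_write_to_2(byte* b, ulint n)
{
	assert(n <= 0xFFFFUL);
	b[0] = (byte) (n >> 8);
	b[1] = (byte) n;
}

/* Write the 32 least significant bits of n, most significant
byte first. */
void
sk_write_to_4(byte* b, ulint n)
{
	b[0] = (byte) (n >> 24);
	b[1] = (byte) (n >> 16);
	b[2] = (byte) (n >> 8);
	b[3] = (byte) n;
}
```

With a big-endian 4-byte write at the type offset, a small type code lands in the third and fourth bytes: the 2-byte type field itself reads as zero, and the two bytes that follow it are overwritten.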
buf_page_init(), buf_page_create(), buf_read_page_low(),
buf_page_init_for_read(): Add parameter zip_size.
buf_page_init_for_backup_restore(),
recv_apply_log_recs_for_backup(): Enclose in #ifdef UNIV_HOTBACKUP.
Enclose some debug code in #ifdef UNIV_LOG_REPLICATE.
page_zip_write_header_log(): Replace page_zip with a pointer to
the uncompressed page.
page_zip_write_rec(): Relax an assertion about blob_no + n_ext.
page_copy_rec_list_to_created_page_write_log(): Allow logging to be disabled.
and to the file space header (FSP_PAGE_ZIP_SIZE, renamed from
FSP_LOWEST_NO_WRITE).
fil_space_struct: Add zip_size.
dict_table_struct: Embed zip_size in flags.
dict_table_zip_size(): Infer zip_size from table->flags.
dict_sys_tables_get_zip_size(): Read zip_size from SYS_TABLES.TYPE.
fil_space_get_zip_size(): Read zip_size from the file space header.
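One way to embed a compressed page size in a flags word, as a sketch only. The bit layout below (a 4-bit shift count stored above bit 0, with 0 meaning "not compressed") and all SK_ and sk_ names are hypothetical; the changelog does not state the actual encoding.

```c
#include <assert.h>

typedef unsigned long ulint;

/* Hypothetical flag layout: bit 0 is an unrelated format flag,
bits 1..4 hold a shift count s; zip_size = 512 << s, and s == 0
means that the table is not compressed. */
#define SK_TF_COMPACT		1UL
#define SK_TF_ZSSIZE_SHIFT	1
#define SK_TF_ZSSIZE_MASK	(15UL << SK_TF_ZSSIZE_SHIFT)

/* Return flags with zip_size embedded. zip_size must be 0 or a
power of two between 1024 and 512 << 15. */
ulint
sk_flags_set_zip_size(ulint flags, ulint zip_size)
{
	ulint	s = 0;

	if (zip_size) {
		for (s = 1; s < 15 && (512UL << s) < zip_size; s++) {
		}
		assert((512UL << s) == zip_size);
	}

	return((flags & ~SK_TF_ZSSIZE_MASK) | (s << SK_TF_ZSSIZE_SHIFT));
}

/* Infer the compressed page size from flags; 0 if not compressed. */
ulint
sk_flags_get_zip_size(ulint flags)
{
	ulint	s = (flags & SK_TF_ZSSIZE_MASK) >> SK_TF_ZSSIZE_SHIFT;

	return(s ? 512UL << s : 0);
}
```

Storing only the shift count keeps the field small and makes every representable value a valid power-of-two page size.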
Add the redo log entry type MLOG_ZIP_FILE_CREATE.
page_zip_alloc(): Add parameter "mtr" and log successful calls
to page_zip_compress().
page_zip_write_blob_ptr(), page_zip_write_node_ptr(): Write the offset on
the uncompressed page, because mlog_write_initial_log_record_fast()
does not do so.
page_zip_write_header_log(), page_zip_parse_write_header(): Encode the
offset in one byte.
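A one-byte offset suffices when the logged range is known to lie within the first 256 bytes of the page. The sketch below shows the idea with hypothetical names; the header limit SK_HEADER_END and the two-byte (offset, length) layout are assumptions, not the actual record format.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long ulint;
typedef unsigned char byte;

/* Assumed end of the page header; must not exceed 255. */
#define SK_HEADER_END	94

/* Write (offset, length) as two single bytes. Returns the end of
the written data. */
byte*
sk_write_header_log(byte* log, ulint offset, ulint length)
{
	assert(length > 0);
	assert(offset + length <= SK_HEADER_END);

	log[0] = (byte) offset;
	log[1] = (byte) length;

	return(log + 2);
}

/* Parse (offset, length). Returns the end of the parsed data, or
NULL if the buffer is too short or the values are out of range. */
const byte*
sk_parse_header_log(const byte* ptr, const byte* end,
		    ulint* offset, ulint* length)
{
	if (end - ptr < 2) {
		return(NULL);
	}

	*offset = ptr[0];
	*length = ptr[1];

	if (*length == 0 || *offset + *length > SK_HEADER_END) {
		return(NULL);
	}

	return(ptr + 2);
}
```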
MLOG_ZIP_COMPRESS and MLOG_ZIP_DECOMPRESS with higher-level entry types.
Implement the logging and crash recovery of MLOG_ZIP_PAGE_CREATE.
page_create_zip(): New function for creating a compressed B-tree page.
page_parse_create_zip(): New function for applying an MLOG_ZIP_PAGE_CREATE
redo log record.
btr_page_create(): Remove the prototype. Add parameters page_zip, level,
prev, and next.
btr0btr.c: Eliminate page_zip_compress() calls where possible.
page_zip_alloc(), page_zip_compress(), page_zip_decompress(),
page_zip_clear_rec(): Remove parameter mtr.
recv_parse_or_apply_log_rec_body(): Handle MLOG_ZIP_PAGE_CREATE.
Add TODO comments for the other added redo log entry types.
dict_mem_table_create(): Account for DICT_TF_COMPRESSED in a debug assertion.
btr_store_big_rec_extern_fields(), btr_free_externally_stored_field(),
btr_copy_externally_stored_field(): Implement the disk format for
compressed BLOB pages.
btr_copy_externally_stored_field(): Improve error reporting and handling
when decompressing BLOB pages.
buf_flush_init_for_writing(), buf_page_is_corrupted(), buf_page_print():
Account for compressed BLOB pages (FIL_PAGE_TYPE_ZBLOB).
buf_calc_zblob_page_checksum(): New function.
Replace btr_page_get_level() with page_is_leaf() where possible.
row_purge_upd_exist_or_extern(): Remove obsolete TODO comment.
dtuple_convert_big_rec(): Replace a flag variable with goto.
btr_store_big_rec_extern_fields(): Assert that page_zip is non-NULL
if and only if dict_table_is_zip() holds.
btr_free_externally_stored_field(): Observe dict_table_is_zip().
Allow page_zip==NULL even if dict_table_is_zip(). Remove the
related TODO comment in row_purge_upd_exist_or_extern().
page_zip_available(): uncompressed_size already includes
PAGE_ZIP_DIR_SLOT_SIZE.
page_zip_decompress(): Remove bogus assertion d_stream.next_out == last.
Do not subtract BTR_EXTERN_FIELD_REF_SIZE from d_stream.avail_in when
decompressing records, because the records may be deleted later in
page_zip_apply_log(), and no BLOB pointers are allocated for deleted
records.
btr_page_split_and_insert(): Avoid dereferencing pointers to garbage on
the old page.
btr_cur_pessimistic_insert(): Pass pointer to big_rec_vec to
btr_cur_optimistic_insert().
trx_undo_prev_version_build(): Only invoke rec_set_field_extern_bits()
if n_ext_vect > 0.
row_ins_index_entry_low(): Simplify a debug assertion.
page_copy_rec_list_end_no_locks(): Make the loop slightly more readable.
page_delete_rec_list_end(): Delete records on compressed pages one by one.
There are still some bugs in the code.
btr_store_big_rec_extern_fields(): Remove assertion on dict_table_is_zip()
to ease testing.
btr_free_externally_stored_field(): Test page_zip instead of
dict_table_is_zip().
page_zip_write_rec(): Add parameter "create". Try to handle externally
stored columns.
rec_offs_any_extern(): Correct the function comment.
Add rec_offs_n_extern() and page_zip_get_n_prev_extern().
page_zip_dir_decode(): Replace assertion with if (...) return(FALSE).
page_zip_decompress(): Do not clear page_zip->n_blobs after counting the
BLOBs.
page_zip_write_blob_ptr(): Use page_zip_get_n_prev_extern().
Correct an off-by-one error in memcpy().
btr_cur_pessimistic_update(): Remove extraneous page_zip_write_rec() call.
btr_cur_set_ownership_of_extern_field(): Simplify the logic.
row_upd_rec_in_place(): Make use of parameter "index" in debug assertions.
page_zip_write_rec(): Remove TODO comment about redo log record.
The write will already be covered by higher-level log entries.
of clustered indexes. Previously, parts of the code assumed that these
columns would exist on all leaf pages. Simplify the update-in-place of
these columns.
Add inline function dict_index_is_clust() to replace all tests
index->type & DICT_CLUSTERED.
Remove the redo log entry types MLOG_ZIP_WRITE_TRX_ID and
MLOG_ZIP_WRITE_ROLL_PTR, because the modifications to these columns
are covered by logical logging.
Fuse page_zip_write_trx_id() and page_zip_write_roll_ptr() into
page_zip_write_trx_id_and_roll_ptr().
page_zip_dir_add_slot(), page_zip_available(): Add flag "is_clustered",
so that no space will be reserved for TRX_ID and ROLL_PTR on leaf pages
of secondary indexes.
page_zip_apply_log(): Flag an error when val==0 is encoded with two bytes.
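The check on val==0 can be sketched with a variable-length decoder. The encoding assumed here (values below 128 in one byte; otherwise the high bit of the first byte set and 15 bits spread over two bytes) and the name sk_decode_val() are assumptions made for illustration.

```c
#include <stddef.h>

typedef unsigned long ulint;
typedef unsigned char byte;

/* Decode a value stored in one or two bytes. Returns the pointer
past the value, or NULL on a truncated or non-canonical encoding. */
const byte*
sk_decode_val(const byte* ptr, const byte* end, ulint* val)
{
	if (ptr >= end) {
		return(NULL);
	}

	*val = *ptr++;

	if (*val & 0x80) {
		if (ptr >= end) {
			return(NULL);
		}

		*val = ((*val & 0x7f) << 8) | *ptr++;

		if (*val == 0) {
			/* Zero must be encoded in a single byte;
			treat the two-byte form as corruption. */
			return(NULL);
		}
	}

	return(ptr);
}
```

Rejecting the two-byte zero keeps a corrupted or misaligned stream from being read as a valid value.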
page_zip_write_rec(): Add debug assertions that there is enough space
available for the entry before copying the data bytes of the record.
row_upd_rec_in_place(), page_zip_write_rec(): Add parameter "index".
page_dir_set_n_heap(): Add a debug assertion that on compressed
pages, n_heap will always be incremented by one. Improve code formatting.
page_zip_dir_add_slot(): New function, called from
page_cur_insert_rec_low() after page_mem_alloc_heap().
rec_set_n_owned_new(): Do not call page_zip_rec_set_owned()
on the supremum record.
rec_offs_make_valid(): Add debug assertions.
page_zip_dir_user_size(): Correct an off-by-one error in the debug assertion.
page_zip_apply_log(): Add parameter trx_id_col. Skip trx_id and roll_ptr.
page_zip_decompress(): Simplify the handling of "storage" in the loop that
copies the uncompressed fields.
page_zip_write_rec(): Store trx_id and roll_ptr separately.
page_zip_write_trx_id(), page_zip_write_roll_ptr(): Fix off-by-one errors.
page_cur_insert_rec_low(): Call page_zip_dir_add_slot() after
page_mem_alloc_heap(). Remove some redundant assertions.
Pass page_zip to page_dir_split_slot().
btr_create(): page_zip_compress() returns FALSE on failure.
page_zip_write_header(): Write to page_zip->data[] instead of page_zip[].
buf_flush_init_for_writing(): Add parameter page_zip and set the fields
also in the header of the compressed page.
btr_cur_search_to_nth_level(): Add ut_ad() on page_zip_validate().