This patch tries to enable a resizeable buffer pool by polling the
configuration parameter for the buffer pool size, which is not a good
solution. The right way could be to have the update function callback of
a settable MySQL variable send an event to the master thread.
It could also make sense to expose the buffer pool chunks to the user.
The first chunk would never be freed. Other chunks than the first one
would only be used for allocating page frames (uncompressed or compressed)
and block descriptors of compressed pages (buf_page_t). That is, other
users of the buffer pool, such as mem_heap_create_block() and the lock
table, would be restricted to the first chunk. This would allow other
chunks to be freed by simply flushing any dirty blocks that they contain.
It might also be worthwhile to create multiple chunks initially, based on
the initial buffer pool size and the HugeTLB page size. In that way, the
buffer pool could be reduced from the initial configuration at runtime.
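
The chunk arithmetic suggested above can be sketched as follows. This is
only an illustration: sketch_n_chunks() and sketch_chunks_to_free() are
hypothetical helpers, not InnoDB functions, and the sketch assumes that
every chunk has the same size (for example, a multiple of the HugeTLB page
size) and that the first chunk is never freed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of equally sized chunks needed to cover
   pool_size bytes (rounded up). */
static size_t sketch_n_chunks(size_t pool_size, size_t chunk_size)
{
    assert(chunk_size > 0);
    return (pool_size + chunk_size - 1) / chunk_size;
}

/* Hypothetical helper: how many chunks could be released to shrink the
   pool to target_size. At least one chunk (the first) is always kept. */
static size_t sketch_chunks_to_free(size_t n_chunks, size_t chunk_size,
                                    size_t target_size)
{
    size_t keep = sketch_n_chunks(target_size, chunk_size);

    if (keep < 1) {
        keep = 1; /* the first chunk would never be freed */
    }

    return n_chunks > keep ? n_chunks - keep : 0;
}
```

With a 128 MiB pool and 2 MiB chunks, this yields 64 chunks, of which 48
could be released to reach 32 MiB.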
single-table tablespaces. This bug was reported by Sunny as Mantis issue #26.
fil_space_create(), fil_create_new_single_table_tablespace(),
fil_open_single_table_tablespace(), fsp_header_init_fields():
Add ut_a(flags != DICT_TF_COMPACT).
dict_build_table_def_step(), row_import_tablespace_for_mysql(),
row_truncate_table_for_mysql(): Pass correct flags to
fil_create_new_single_table_tablespace() or fil_open_single_table_tablespace().
my_error(ER_TOO_BIG_ROWSIZE, ...). Otherwise, MySQL can report
"Got error 139 from storage engine" instead of the appropriate
error message.
ha_innobase::index_read(), ha_innobase::general_fetch():
Replace if-else if-else with switch-case.
Pass table->flags to convert_error_code_to_mysql().
innodb_check_for_record_too_big_error(). Remove. This code belongs to
convert_error_code_to_mysql().
convert_error_code_to_mysql(): Add the parameter "flags", for table flags.
Translate DB_TOO_BIG_RECORD into ER_TOO_BIG_ROWSIZE.
create_index(): Add the parameter "flags".
create_clustered_index_when_no_primary(): Replace the parameter "comp"
with "flags".
innobase_drop_database(): Remove the #ifdef'd-out call to
convert_error_code_to_mysql().
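
A minimal sketch of the switch-based translation described above. All
names are hypothetical stand-ins: the enum values do not correspond to the
real DB_* or MySQL codes, and the report structure merely stands in for
the call to my_error(ER_TOO_BIG_ROWSIZE, ...). The table flags are only
recorded here; how the real function uses them is not shown in this log.

```c
#include <assert.h>

/* Illustrative stand-ins for InnoDB error codes (values are arbitrary). */
enum sketch_db_err {
    SKETCH_DB_SUCCESS,
    SKETCH_DB_TOO_BIG_RECORD,
    SKETCH_DB_OTHER
};

/* Illustrative stand-ins for handler return codes (values are arbitrary). */
enum sketch_ha_err {
    SKETCH_HA_OK,
    SKETCH_HA_ERR_TOO_BIG_ROW,
    SKETCH_HA_ERR_GENERIC
};

/* Stands in for the side effect of my_error(ER_TOO_BIG_ROWSIZE, ...). */
struct sketch_report {
    int raised;
    unsigned long flags;
};

static enum sketch_ha_err sketch_convert_error(enum sketch_db_err err,
                                               unsigned long flags,
                                               struct sketch_report *report)
{
    switch (err) {
    case SKETCH_DB_SUCCESS:
        return SKETCH_HA_OK;
    case SKETCH_DB_TOO_BIG_RECORD:
        /* Report the specific error here, so that the caller does not
           fall back to a generic "Got error ... from storage engine". */
        report->raised = 1;
        report->flags = flags;
        return SKETCH_HA_ERR_TOO_BIG_ROW;
    default:
        return SKETCH_HA_ERR_GENERIC;
    }
}
```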
Throw warnings, not errors, for wrong ROW_FORMAT or KEY_BLOCK_SIZE,
so that any table dump can be loaded.
As of this change, InnoDB supports the following table formats:
ROW_FORMAT=REDUNDANT
the only format before MySQL/InnoDB 5.0.3
ROW_FORMAT=COMPACT
the new default format of MySQL/InnoDB 5.0.3
ROW_FORMAT=DYNAMIC
uncompressed, no prefix in the clustered index record for BLOBs
ROW_FORMAT=COMPRESSED
like ROW_FORMAT=DYNAMIC, but zlib compressed B-trees and BLOBs;
the compressed page size is specified by KEY_BLOCK_SIZE in
kilobytes (1, 2, 4, 8, or 16; default 8)
KEY_BLOCK_SIZE=1, 2, 4, 8, or 16: implies ROW_FORMAT=COMPRESSED;
ignored if ROW_FORMAT is not COMPRESSED
KEY_BLOCK_SIZE=anything else: ignored
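
The rules listed above can be read as the following decision function.
This is a sketch of one interpretation, not the code in
ha_innobase::create(): the enum and function names are hypothetical, and
it assumes that a valid KEY_BLOCK_SIZE implies compression only when no
ROW_FORMAT was given explicitly.

```c
#include <assert.h>

/* Hypothetical stand-in for the ROW_FORMAT requested in CREATE TABLE. */
enum sketch_row_format {
    SKETCH_ROW_DEFAULT,    /* no ROW_FORMAT specified */
    SKETCH_ROW_REDUNDANT,
    SKETCH_ROW_COMPACT,
    SKETCH_ROW_DYNAMIC,
    SKETCH_ROW_COMPRESSED
};

/* Only 1, 2, 4, 8 and 16 (kilobytes) are accepted. */
static int sketch_key_block_size_valid(unsigned kbs)
{
    return kbs == 1 || kbs == 2 || kbs == 4 || kbs == 8 || kbs == 16;
}

/* Returns the compressed page size in kilobytes, or 0 if the table
   would not be compressed. */
static unsigned sketch_zip_size_kb(enum sketch_row_format fmt, unsigned kbs)
{
    int valid = sketch_key_block_size_valid(kbs);

    switch (fmt) {
    case SKETCH_ROW_DEFAULT:
        /* A valid KEY_BLOCK_SIZE implies ROW_FORMAT=COMPRESSED. */
        return valid ? kbs : 0;
    case SKETCH_ROW_COMPRESSED:
        /* Missing or invalid KEY_BLOCK_SIZE: use the default of 8. */
        return valid ? kbs : 8;
    default:
        /* KEY_BLOCK_SIZE is ignored for other explicit formats. */
        return 0;
    }
}
```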
The InnoDB row format is displayed in the 4th column (Row_format) of
the output of SHOW TABLE STATUS. The Create_options column may show
ROW_FORMAT= and KEY_BLOCK_SIZE=, but they do not necessarily have
anything to do with InnoDB.
The table format can also be queried like this:
SELECT table_schema, table_name, row_format
FROM information_schema.tables
WHERE engine='innodb' and row_format in ('Compressed','Dynamic');
When Row_format='Compressed', KEY_BLOCK_SIZE should usually correspond
to the compressed page size. But the .frm file could be manipulated
to show any KEY_BLOCK_SIZE.
For some reason, INFORMATION_SCHEMA.TABLES.CREATE_OPTIONS does not
include KEY_BLOCK_SIZE. It does include row_format (spelled in
lowercase). This looks like a MySQL bug, because the table
INFORMATION_SCHEMA.TABLES probably tries to replace SHOW TABLE STATUS.
I reported this as Bug #35275 <http://bugs.mysql.com/35275>.
ha_innobase::get_row_type(): Add ROW_TYPE_COMPRESSED, ROW_TYPE_DYNAMIC.
ha_innobase::create(): Implement ROW_FORMAT=COMPRESSED and
ROW_FORMAT=DYNAMIC. Do not throw errors for wrong ROW_FORMAT or
KEY_BLOCK_SIZE, but issue warnings instead.
ha_innobase::check_if_incompatible_data(): Return COMPATIBLE_DATA_NO
if KEY_BLOCK_SIZE has been specified.
innodb.result: Adjust the result for the warning issued for ROW_FORMAT=FIXED.
innodb-zip.test: Add tests. Query INFORMATION_SCHEMA.TABLES for ROW_FORMAT.
externally stored columns.
innodb-zip.test: Correct the test case. Without the fixes, the test
would fail, because the BLOB would be prepended with a 768-byte prefix
of the data.
row_upd_index_replace_new_col_vals_index_pos(),
row_upd_index_replace_new_col_vals(): Use only one "heap"
parameter that must be non-NULL. When fetching externally
stored columns, use upd_field_t::orig_len.
upd_get_field_by_field_no(): New accessor function, for retrieving
a field from an update vector by field_no.
row_upd_index_replace_new_col_val(): New function, for replacing the
value from an update vector. This used to be duplicated code in
row_upd_index_replace_new_col_vals_index_pos() and
row_upd_index_replace_new_col_vals().
variable innodb_file_format. Implement file format version stamping of
*.ibd files and SYS_TABLES.TYPE.
This change introduces an incompatible change for compressed tables.
We can do this, as we have not released yet.
innodb-zip.test: Add tests for stricter KEY_BLOCK_SIZE and ROW_FORMAT
checks.
DICT_TF_COMPRESSED_MASK, DICT_TF_COMPRESSED_SHIFT: Replace with
DICT_TF_ZSSIZE_MASK, DICT_TF_ZSSIZE_SHIFT.
DICT_TF_FORMAT_MASK, DICT_TF_FORMAT_SHIFT, DICT_TF_FORMAT_51,
DICT_TF_FORMAT_ZIP: File format version, stored in table->flags,
in the .ibd file header, and in SYS_TABLES.TYPE.
dict_create_sys_tables_tuple(): Write the table flags to SYS_TABLES.TYPE
if the format is at least DICT_TF_FORMAT_ZIP. For old formats
(DICT_TF_FORMAT_51), write DICT_TABLE_ORDINARY as the table type.
DB_TABLE_ZIP_NO_IBD: Remove the error code. The error handling is done
in ha_innodb.cc; as a failsafe measure, dict_build_table_def_step() will
silently clear the compression and format flags instead of returning this
error.
dict_mem_table_create(): Assert that no extra bits are set in the flags.
dict_sys_tables_get_zip_size(): Rename to dict_sys_tables_get_flags().
Check all flag bits, and return ULINT_UNDEFINED if the combination is
unsupported.
dict_boot(): Document the SYS_TABLES columns N_COLS and TYPE.
dict_table_get_format(), dict_table_set_format(),
dict_table_flags_to_zip_size(): New accessors to table->flags.
dtuple_convert_big_rec(): Introduce the auxiliary variables
local_len, local_prefix_len. Store a 768-byte prefix locally
if the file format is less than DICT_TF_FORMAT_ZIP.
dtuple_convert_back_big_rec(): Restore the columns.
srv_file_format: New variable: innodb_file_format.
fil_create_new_single_table_tablespace(): Replace the parameter zip_size
with table->flags.
fil_open_single_table_tablespace(): Replace the parameter zip_size_in_k
with table->flags. Check the flags.
fil_space_struct, fil_space_create(), fil_op_write_log():
Replace zip_size with flags.
fil_node_open_file(): Note a TODO item for InnoDB Hot Backup.
Check that the tablespace flags match.
fil_space_get_zip_size(): Rename to fil_space_get_flags(). Add a
wrapper for fil_space_get_zip_size().
fsp_header_get_flags(): New function.
fsp_header_init_fields(): Replace zip_size with flags.
FSP_SPACE_FLAGS: New name for the tablespace flags. This field used
to be called FSP_PAGE_ZIP_SIZE, or FSP_LOWEST_NO_WRITE. It has always
been written as 0 in MySQL/InnoDB versions 4.1 to 5.1.
MLOG_ZIP_FILE_CREATE: Rename to MLOG_FILE_CREATE2. Add a 32-bit
parameter for the tablespace flags.
ha_innobase::create(): Check the table attributes ROW_FORMAT and
KEY_BLOCK_SIZE. Issue errors if they are inappropriate, or warnings
if the inherited attributes (in ALTER TABLE) will be ignored.
PAGE_ZIP_MIN_SIZE_SHIFT: New constant: the 2-logarithm of PAGE_ZIP_MIN_SIZE.
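
To illustrate how a compressed page size and a file format version can
share one flags word, here is a minimal sketch. The SKETCH_* constants are
hypothetical: the bit positions and widths, and a minimum compressed page
size of 1024 bytes, are assumptions made for the example and are not taken
from dict0mem.h or page0types.h. The validity rule (compression requires
the compact flag and at least the ZIP format) is likewise an assumption.

```c
#include <assert.h>

/* Assumed layout: bit 0 = compact, bits 1..4 = shift size of the
   compressed page, bit 5 = file format version. */
#define SKETCH_TF_COMPACT         1UL
#define SKETCH_TF_ZSSIZE_SHIFT    1
#define SKETCH_TF_ZSSIZE_MASK     (15UL << SKETCH_TF_ZSSIZE_SHIFT)
#define SKETCH_TF_FORMAT_SHIFT    5
#define SKETCH_TF_FORMAT_MASK     (1UL << SKETCH_TF_FORMAT_SHIFT)
#define SKETCH_TF_FORMAT_51       0UL
#define SKETCH_TF_FORMAT_ZIP      1UL
/* Assumed: 2-logarithm of the minimum compressed page size. */
#define SKETCH_ZIP_MIN_SIZE_SHIFT 10
#define SKETCH_ZIP_MIN_SIZE       (1UL << SKETCH_ZIP_MIN_SIZE_SHIFT)

static unsigned long sketch_flags_make(int compact, unsigned long zssize,
                                       unsigned long format)
{
    unsigned long flags = compact ? SKETCH_TF_COMPACT : 0;

    assert(zssize <= 15 && format <= 1);
    flags |= zssize << SKETCH_TF_ZSSIZE_SHIFT;
    flags |= format << SKETCH_TF_FORMAT_SHIFT;
    return flags;
}

static unsigned long sketch_flags_get_format(unsigned long flags)
{
    return (flags & SKETCH_TF_FORMAT_MASK) >> SKETCH_TF_FORMAT_SHIFT;
}

/* Returns the compressed page size in bytes, or 0 if not compressed. */
static unsigned long sketch_flags_to_zip_size(unsigned long flags)
{
    unsigned long zssize = (flags & SKETCH_TF_ZSSIZE_MASK)
        >> SKETCH_TF_ZSSIZE_SHIFT;

    return zssize ? (SKETCH_ZIP_MIN_SIZE >> 1) << zssize : 0;
}

/* Rejects unknown bits and unsupported combinations. */
static int sketch_flags_valid(unsigned long flags)
{
    unsigned long known = SKETCH_TF_COMPACT | SKETCH_TF_ZSSIZE_MASK
        | SKETCH_TF_FORMAT_MASK;

    if (flags & ~known) {
        return 0;
    }

    if (sketch_flags_to_zip_size(flags) != 0
        && (!(flags & SKETCH_TF_COMPACT)
            || sketch_flags_get_format(flags) < SKETCH_TF_FORMAT_ZIP)) {
        return 0;
    }

    return 1;
}
```

Under these assumptions, a shift size of 4 corresponds to an 8192-byte
compressed page, and a plain compact table in the old format is encoded
as the single bit SKETCH_TF_COMPACT.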
There is one consideration: fil_init() chooses the tablespace hash size
based on the initial value of srv_file_per_table. However, this is nothing
new: InnoDB could be started with innodb_file_per_table=0 even though
*.ibd files exist.
srv_file_per_table: Declare as my_bool instead of ibool, because
MYSQL_SYSVAR_BOOL() expects a pointer to my_bool. Document the
variable also in srv0srv.h.
innobase_start_or_create_for_mysql(): Note why it is OK to temporarily
clear srv_file_per_table.
innobase_file_per_table: Remove.
(Warning C4090 is incorrectly issued when using Visual C++ .NET 2003,
bug 101661, http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=101661)
dict_table_find_equivalent_index(): Cast away constness in the mem_free()
call. MSVC seems to think that an array of pointers to const data is
const itself.
UT_SORT_FUNCTION_BODY(): Cast away constness in the memcpy() call.
MSVC seems to think that an array of pointers to const data is const itself.
as type-independent macros instead of functions. Because ut_2pow_round()
and ut_2pow_remainder() no longer assert ut_is_2pow(m), add the assertions
to callers when needed. Also add parentheses to assist the compiler in
common subexpression elimination.
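
A sketch of what such type-independent macros might look like, with the
power-of-two assertion moved to the caller. The sketch_* names are
hypothetical, and the macro bodies are the usual bit-mask idiom rather
than a copy of ut0ut.h.

```c
#include <assert.h>

/* Nonzero iff n is a power of two. */
#define sketch_is_2pow(n)           ((n) != 0 && !((n) & ((n) - 1)))
/* n modulo m, where m must be a power of two. */
#define sketch_2pow_remainder(n, m) ((n) & ((m) - 1))
/* n rounded down to a multiple of m, where m must be a power of two.
   Caution: n and m should have the same width, because ~((m) - 1) is
   computed in the type of m. */
#define sketch_2pow_round(n, m)     ((n) & ~((m) - 1))

/* Hypothetical caller: since the macros no longer check their argument,
   the caller asserts that page_size is a power of two. */
static unsigned long sketch_align_down(unsigned long offset,
                                       unsigned long page_size)
{
    assert(sketch_is_2pow(page_size));
    return sketch_2pow_round(offset, page_size);
}
```

Because the macros expand in place, they work for any unsigned integer
type, at the cost of evaluating their arguments more than once.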
Compute BUF_READ_AHEAD_RANDOM_AREA and BUF_READ_AHEAD_LINEAR_AREA
only once. The definition of BUF_READ_AHEAD_AREA depends on
buf_pool->curr_size, which could change while this code is running.
lock_get_table(), locks_row_eq_lock(), buf_page_get_mutex(): Add return
after ut_error. On Windows, ut_error is not declared as "noreturn".
Add explicit type casts when assigning ulint to byte to get rid of
"possible loss of precision" warnings.
struct i_s_table_cache_struct: Declare rows_used, rows_allocd as ulint
instead of ullint. 32 bits should be enough.
fill_innodb_trx_from_cache(), i_s_zip_fill_low(): Cast 64-bit unsigned
integers to longlong when calling Field::store(longlong, bool is_unsigned).
Otherwise, the compiler would implicitly convert them to double and
invoke Field::store(double) instead.
recv_truncate_group(), recv_copy_group(), recv_calc_lsn_on_data_add():
Cast ib_uint64_t expressions to ulint to get rid of "possible loss of
precision" warnings. (There should not be any loss of precision in
these cases.)
log_close(), log_checkpoint_margin(): Declare some variables as ib_uint64_t
instead of ulint, so that there won't be any potential loss of precision.
mach_write_ull(): Cast the second argument of mach_write_to_4() to ulint.
OS_FILE_FROM_FD(): Cast the return value of _get_osfhandle() to HANDLE.
row_merge_dict_table_get_index(): Cast the parameter of mem_free() to (void*)
in order to get rid of the bogus MSVC warning C4090, which has been reported
as MSVC bug 101661:
<http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=101661>
row_mysql_read_blob_ref(): To get rid of a bogus MSVC warning C4090,
drop a const qualifier.
blocks that contain uncompressed and compressed frames. This patch was
designed by Heikki and Inaam, implemented by Inaam, and refined and reviewed
by Marko and Sunny.
buf_buddy_n_frames, buf_buddy_min_n_frames, buf_buddy_max_n_frames: Remove.
buf_page_belongs_to_unzip_LRU(): New predicate:
bpage->zip.data && buf_page_get_state(bpage) == BUF_BLOCK_FILE_PAGE.
buf_pool_t, buf_block_t: Add the linked list unzip_LRU. A block in the
regular LRU list is in unzip_LRU iff buf_page_belongs_to_unzip_LRU() holds.
buf_LRU_free_block(): Add a third return value to refine the case
"cannot free the block".
buf_LRU_search_and_free_block(): Update the documentation to reflect the
implementation.
buf_LRU_stat_t, buf_LRU_stat_cur, buf_LRU_stat_sum, buf_LRU_stat_arr[]:
Statistics for the unzip_LRU algorithm.
buf_LRU_stat_update(): New function: Update the statistics. Called once
per second by srv_error_monitor_thread().
buf_LRU_validate(): Validate the unzip_LRU list as well.
buf_LRU_evict_from_unzip_LRU(): New predicate: Use the unzip_LRU before
falling back to the regular LRU?
buf_LRU_free_from_unzip_LRU_list(), buf_LRU_free_from_common_LRU_list():
Subfunctions of buf_LRU_search_and_free_block().
buf_LRU_search_and_free_block(): Reimplement. Try to evict an uncompressed
page from the unzip_LRU list before falling back to evicting an entire block
from the common LRU list.
buf_unzip_LRU_remove_block_if_needed(): New function.
buf_unzip_LRU_add_block(): New function: Add a block to the unzip_LRU list.
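
The membership predicate and the two-stage eviction order can be sketched
as below. The struct, enum and function names are hypothetical
simplifications of buf_page_t and the buf_LRU_* functions. In particular,
the decision made by buf_LRU_evict_from_unzip_LRU() is reduced to a flag
supplied by the caller, since the statistics it relies on are not
described here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced page states. */
enum sketch_state {
    SKETCH_ZIP_PAGE,  /* only the compressed copy is in memory */
    SKETCH_FILE_PAGE  /* an uncompressed frame exists */
};

struct sketch_page {
    const void       *zip_data; /* compressed copy, or NULL */
    enum sketch_state state;
};

/* A block is in unzip_LRU iff it holds both a compressed copy and an
   uncompressed frame. */
static int sketch_belongs_to_unzip_LRU(const struct sketch_page *page)
{
    return page->zip_data != NULL && page->state == SKETCH_FILE_PAGE;
}

enum sketch_evicted {
    SKETCH_EVICTED_NONE,
    SKETCH_EVICTED_UNCOMPRESSED_FRAME, /* compressed copy is retained */
    SKETCH_EVICTED_WHOLE_BLOCK
};

/* Try the unzip_LRU list first when the policy says so, then fall back
   to the common LRU list. */
static enum sketch_evicted sketch_search_and_free(size_t unzip_LRU_len,
                                                  size_t LRU_len,
                                                  int prefer_unzip_LRU)
{
    if (prefer_unzip_LRU && unzip_LRU_len > 0) {
        return SKETCH_EVICTED_UNCOMPRESSED_FRAME;
    }

    if (LRU_len > 0) {
        return SKETCH_EVICTED_WHOLE_BLOCK;
    }

    return SKETCH_EVICTED_NONE;
}
```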
innobase_raw_format(), move the definition from row0row.c to
ha_innodb.cc. After this change, row0row.c no longer references
system_charset_info (Mantis issue #17). Patch prepared by Vasil,
tested by Calvin, and reviewed by Marko.
Add CMake-generated files and directories to svn:ignore. This patch
is from Calvin Sun, who couldn't commit it properly on Windows.
Do "svn propset svn:eol-style native" on every text file, to fix
line format problems on Windows.
also when row_merge_create_temporary_table() fails. Otherwise, an
assertion would fail when the client connection is closed, because
prebuilt->trx would still be holding a table lock on innodb_table.
Use innobase_strcasecmp() instead of strcasecmp() in i_s.cc and get rid
of strings.h (that file is not present on Windows).
Move the prototype of innobase_strcasecmp() from ha_innodb.cc and
dict0dict.c to ha_prototypes.h.
Approved by: Heikki
buf_buddy_relocated_duration[],
page_zip_compress_duration[],
page_zip_decompress_duration[]: Record the total duration of the operations.
buf_buddy_relocate(), page_zip_compress(), page_zip_decompress():
Add ut_time_us() instrumentation.
i_s_zip_fields_info[], i_s_zip_fill_low(): Move the columns containing
cumulated statistics last. Add relocated_usec, compressed_usec, and
decompressed_usec.
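
The instrumentation pattern can be sketched as follows. Everything here is
hypothetical: the arrays are assumed to be indexed by a page size class,
sketch_time_us() substitutes the standard clock() (CPU time) for
ut_time_us(), and sketch_noop() merely stands in for an operation such as
page_zip_compress().

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define SKETCH_N_SIZES 5 /* assumed number of size classes */

/* Cumulated duration in microseconds and call count per size class. */
static unsigned long long sketch_duration_us[SKETCH_N_SIZES];
static unsigned long      sketch_count[SKETCH_N_SIZES];

/* Stand-in for ut_time_us(); based on clock(), so it measures CPU time
   at whatever resolution the platform offers. */
static unsigned long long sketch_time_us(void)
{
    return (unsigned long long) clock() * 1000000ULL / CLOCKS_PER_SEC;
}

/* Run op(arg) and add its duration to the statistics of size_class. */
static int sketch_timed_op(unsigned size_class, int (*op)(void *), void *arg)
{
    unsigned long long start;
    int ok;

    assert(size_class < SKETCH_N_SIZES);

    start = sketch_time_us();
    ok = op(arg);
    sketch_duration_us[size_class] += sketch_time_us() - start;
    sketch_count[size_class]++;

    return ok;
}

/* Trivial operation used only to demonstrate the wrapper. */
static int sketch_noop(void *arg)
{
    (void) arg;
    return 1;
}
```

A call such as sketch_timed_op(2, sketch_noop, NULL) increments
sketch_count[2] and adds the elapsed time to sketch_duration_us[2].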
for the purpose of comparing different compression algorithms.
PAGE_ZIP_COMPRESS_DBG: New preprocessor condition, to see if deflate()
is wrapped.
page_zip_compress_log: Log file counter. If set to nonzero, logging
is enabled.
page_zip_compress_deflate(): Add the parameter logfile.
FILE_LOGFILE, LOGFILE: Macros for declaring and passing the parameter logfile.
page_zip_compress(): Open and close the logfile if needed. Write the
uncompressed page and the size of the compressed data. The data passed
to deflate() is written by the wrapper page_zip_compress_deflate().