Remove most references to thread id in InnoDB. Three references
remain: the current holder of a mutex, and the current x-lock holder
of a rw-lock, and some references in UNIV_SYNC_DEBUG checks. This
allows MySQL to change the thread associated to a client connection.
Tighten the UNIV_SYNC_DEBUG checks, trying to ensure that no InnoDB
mutex or x-lock is being held when returning control to MySQL. The
only semaphore that may be held is the btr_search_latch in shared mode.
sync_thread_levels_empty_except_dict(): A wrapper for
sync_thread_levels_empty_gen(TRUE).
sync_thread_levels_nonempty_trx(): Check that the current thread is
not holding any InnoDB semaphores, except btr_search_latch if
trx->has_search_latch.
sync_thread_levels_empty(): Unused function; remove.
trx_t: Remove mysql_thread_id and mysql_process_no.
srv_slot_t: Remove id and handle.
row_search_for_mysql(), srv_conc_enter_innodb(),
srv_conc_force_enter_innodb(), srv_conc_force_exit_innodb(),
srv_conc_exit_innodb(), srv_suspend_mysql_thread: Assert
!sync_thread_levels_nonempty_trx().
rb:634 approved by Sunny Bains
Do not disable InnoDB inlining when UNIV_DEBUG is defined. The
inlining is now solely controlled by the preprocessor symbol
UNIV_MUST_NOT_INLINE and by any compiler options.
mtr_memo_contains(): Add an explicit type conversion from void*, so
that the function can be compiled by a C++ compiler. Previously, this
function was never seen by the C++ compiler, because it is only
present in UNIV_DEBUG builds and InnoDB inlining used to be disabled.
buf_flush_validate_skip(): A wrapper that skips most calls of
buf_flush_validate_low(). Invoked by debug assertions in
buf_flush_insert_into_flush_list() and buf_flush_remove().
fil_validate_skip(): A wrapper that skips most calls of
fil_validate(). Invoked by debug assertions in fil_io() and fil_io_wait().
os_aio_validate_skip(): A wrapper that skips most calls of
os_aio_validate(). Invoked by debug assertions in
os_aio_func(), os_aio_windows_handle() and os_aio_simulated_handle.
os_get_os_version(): Only include this function if __WIN__ is defined.
sync_array_deadlock_step(): Slight optimizations. This function is a
major CPU consumer in UNIV_SYNC_DEBUG builds.
Rename buf_pool_watch, buf_pool_mutex, buf_pool_zip_mutex
to buf_pool->watch, buf_pool->mutex, buf_pool->zip_mutex
in comments. Refer to buf_pool->flush_list_mutex instead of
flush_list_mutex.
Remove obsolete declarations of buf_pool_mutex and buf_pool_zip_mutex.
Additional fixes in 5.5:
ibuf_set_del_mark(): Add diagnostics when setting a buffered delete-mark fails.
ibuf_delete(): Correct a misleading comment about non-found records.
rec_print(): Add a const qualifier to the index parameter.
Bug #56680 wrong InnoDB results from a case-insensitive covering index
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
btr_cur_update_alloc_zip(): Make the function public.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
REC_INFO_DELETED_FLAG: Move the definition from rem0rec.ic to rem0rec.h.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
buf_LRU_free_block(): Refactored from
buf_LRU_search_and_free_block(). This is needed for the
innodb_change_buffering_debug diagnostics.
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
Fix compiler warning:
buf/buf0flu.c: In function 'buf_flush_batch':
buf/buf0flu.c:844:9: error: variable 'old_page_count' set but not used [-Werror=unused-but-set-variable]
pages that it wants to flush then we should honor that value as in
not going beyond that in our eagerness to flush the neighbors of
the selected victim.
In order to allow thread schedulers to be dynamically loaded,
it is necessary to make the following changes to the server:
- Two new service interfaces
- Modifications to InnoDB to inform the thread scheduler of state changes.
- Changes to the VIO subsystem for checking if data is available on a socket.
- Elimination of remains of the old thread pool implementation.
The two new service interfaces introduces are:
my_thread_scheduler
A service interface to register a thread
scheduler.
thd_wait
A service interface to inform thread scheduler
that the thread is about to start waiting.
In addition, the patch adds code that:
- Add a call to thd_wait for table locks in mysys
thd_lock.c by introducing a set function that
can be used to set a callback to be used when
waiting on a lock and resuming from waiting.
- Calling the mysys set function from the server
to set the callbacks correctly.
buf_flush_insert_into_flush_list(),
buf_flush_insert_sorted_into_flush_list(),
buf_flush_post_to_doublewrite_buf(): Check that the page is initialized.
was skipped because another flush batch was active. This is to
ensure that the when we return success then it is guaranteed that
all pages up to the lsn_limit have been flushed to the disk.
bzr branch mysql-5.1-performance-version mysql-trunk # Summit
cd mysql-trunk
bzr merge mysql-5.1-innodb_plugin # which is 5.1 + Innodb plugin
bzr rm innobase # remove the builtin
Next step: build, test fixes.
innodb-5.1-ss1318
innodb-5.1-ss1330
innodb-5.1-ss1332
innodb-5.1-ss1340
Fixes:
- Bug #21409: Incorrect result returned when in READ-COMMITTED with query_cache ON
At low transaction isolation levels we let each consistent read set
its own snapshot.
- Bug #23666: strange Innodb_row_lock_time_% values in show status; also millisecs wrong
On Windows ut_usectime returns secs and usecs relative to the UNIX
epoch (which is Jan, 1 1970).
- Bug #25494: LATEST DEADLOCK INFORMATION is not always cleared
lock_deadlock_recursive(): When the search depth or length is exceeded,
rewind lock_latest_err_file and display the two transactions at the
point of aborting the search.
- Bug #25927: Foreign key with ON DELETE SET NULL on NOT NULL can crash server
Prevent ALTER TABLE ... MODIFY ... NOT NULL on columns for which
there is a foreign key constraint ON ... SET NULL.
- Bug #26835: Repeatable corruption of utf8-enabled tables inside InnoDB
The bug could be reproduced as follows:
Define a table so that the first column of the clustered index is
a VARCHAR or a UTF-8 CHAR in a collation where sequences of bytes
of differing length are considered equivalent.
Insert and delete a record. Before the delete-marked record is
purged, insert another record whose first column is of different
length but equivalent to the first record. Under certain conditions,
the insertion can be incorrectly performed as update-in-place.
Likewise, an operation that could be done as update-in-place can
unnecessarily be performed as delete and insert, but that would not
cause corruption but merely degraded performance.
Fixes:
- Bug #24712: SHOW TABLE STATUS for file-per-table showing incorrect time fields
- Bug #24386: Performance degradation caused by instrumentation in mutex_struct
- Bug #24190: many exportable definitions of field_in_record_is_null
- Bug #21468: InnoDB crash during recovery with corrupted data pages: XA bug?
Bugs fixed:
- Bug #20877: InnoDB data dictionary memory footprint is too big
- Bug #13544: Second delete of same row in transaction illustrates non-optimal locking
- Bug #20791: valgrind errors in InnoDB
Bugs fixed:
- Bug #20791 valgrind errors in InnoDB
Remove Valgrind warning of Bug #20791 : in new database
creation, we read the doublewrite buffer magic number from
uninitialized memory; the code worked because it was extremely
unlikely that the memory would contain the magic number
- Bug #21784 DROP TABLE crashes 5.1.12-pre if concurrent
queries on the table
remove update_thd() in ::store_lock()
Also includes numerous coding style fixes, etc. See file-level
comments for details.
Fixed BUG#19542 "InnoDB doesn't increase the Handler_read_prev couter".
Fixed BUG#19609 "Case sensitivity of innodb_data_file_path gives stupid error".
Fixed BUG#19727 "InnoDB crashed server and crashed tables are ot recoverable".
Also:
* Remove remnants of the obsolete concept of memoryfixing tables and indexes.
* Remove unused dict_table_LRU_trim().
* Remove unused 'trx' parameter from dict_table_get_on_id_low(),
dict_table_get(), dict_table_get_and_increment_handle_count().
* Add a normal linked list implementation.
* Add a work queue implementation.
* Add 'level' parameter to mutex_create() and rw_lock_create().
Remove mutex_set_level() and rw_lock_set_level().
* Rename SYNC_LEVEL_NONE to SYNC_LEVEL_VARYING.
* Add support for bound ids in InnoDB's parser.
* Define UNIV_BTR_DEBUG for enabling consistency checks of
FIL_PAGE_NEXT and FIL_PAGE_PREV when accessing sibling
pages of B-tree indexes.
btr_validate_level(): Check the validity of the doubly linked
list formed by FIL_PAGE_NEXT and FIL_PAGE_PREV.
* Adapt InnoDB to the new tablename to filename encoding in MySQL 5.1.
ut_print_name(), ut_print_name1(): Add parameter 'table_id' for
distinguishing names of tables from other identifiers.
New: innobase_convert_from_table_id(), innobase_convert_from_id(),
innobase_convert_from_filename(), innobase_get_charset.
dict_accept(), dict_scan_id(), dict_scan_col(), dict_scan_table_name(),
dict_skip_word(), dict_create_foreign_constraints_low(): Add
parameter 'cs' so that isspace() can be replaced with my_isspace(),
whose operation depends on the connection character set.
dict_scan_id(): Convert identifier to UTF-8.
dict_str_starts_with_keyword(): New extern function, to replace
dict_accept() in row_search_for_mysql().
mysql_get_identifier_quote_char(): Replaced with innobase_print_identifier().
ha_innobase::create(): Remove the thd->convert_strin() call. Pass the
statement to InnoDB in the connection character set and let InnoDB
convert the identifier to UTF-8.
* Add max_row_size to dict_table_t.
* btr0cur.c
btr_copy_externally_stored_field(): Only set the 'offset' variable
when needed.
* buf0buf.c
buf_page_io_complete(): Write to the error log if the page number or
the space id o the disk do not match those in memory. Also write to
the error log if a page was read from the doublewrite buffer. The
doublewrite buffer should be only read by the lower-level function
fil_io() at database startup.
* dict0dict.c
dict_scan_table_name(): Remove fallback to differently encoded name
when the table is not found. The encoding is handled at a higher level.
* ha_innodb.cc
Increment statistic counter in ha_innobase::index_prev() (bug 19542).
Add innobase_convert_string wrapper function and a new file
ha_prototypes.h.
innobase_print_identifier(): Remove TODO comment before calling
get_quote_char_for_identifier(). That function apparently assumes
the identifier to be encoded in UTF-8.
* ibuf0ibuf.c|h
ibuf_count_get(), ibuf_counts[], ibuf_count_inited(): Define these
only #ifdef UNIV_IBUF_DEBUG. Previously, when compiled without
UNIV_IBUF_DEBUG, invoking ibuf_count_get() would crash InnoDB.
The function is only being called #ifdef UNIV_IBUF_DEBUG.
* innodb.result
Adjust the results for changes in the foreign key error messages.
* mem0mem.c|h
New: mem_heap_dup(), mem_heap_printf(), mem_heap_cat().
* os0file.c
Check the page trailers also after writing to disk. This improves
chances of diagnosing bug 18886.
os_file_check_page_trailers(): New function for checking that the
two copies of the LSN stamped on the page match.
os_aio_simulated_handle(): Call os_file_check_page_trailers()
before and after os_file_write().
* row0mysql.c
Move trx_commit_for_mysql(trx) calls before calls to
row_mysql_unlock_data_dictionary(trx) (bug 19727).
* row0sel.c
row_fetch_print(): Handle SQL NULL values without crashing.
row_sel_store_mysql_rec(): Remove useless call to rec_get_nth_field
when handling an externally stored column.
Fetch externally stored fields when using InnoDB's internal SQL
parser.
Optimize BLOB selects by using prebuilt->blob_heap directly instead
of first reading BLOB data to a temporary heap and then copying it
to prebuilt->blob_heap.
* srv0srv.c
srv_master_thread(): Remove unreachable code.
* srv0start.c
srv_parse_data_file_paths_and_sizes(): Accept lower-case 'm' and
'g' as abbreviations of megabyte and gigabyte (bug 19609).
srv_parse_megabytes(): New fuction.
* ut0dbg.c|h
Implement InnoDB assertions (ut_a and ut_error) with abort() when
the code is compiled with GCC 3 or later on other platforms than
Windows or Netware. Also disable the variable ut_dbg_stop_threads
and the function ut_dbg_stop_thread() i this case, unless
UNIV_SYC_DEBUG is defined. This should allow the compiler to
generate more compact code for assertions.
* ut0list.c|h
Add ib_list_create_heap().
Fixed BUGS:
#3300: "UPDATE statement with no index column in where condition locks
all rows"
Implement semi-consistent read to reduce lock conflicts at the cost
of breaking serializability.
ha_innobase::unlock_row(): reset the "did semi consistent read" flag
ha_innobase::was_semi_consistent_read(),
ha_innobase::try_semi_consistent_read(): new methods
row_prebuilt_t, row_create_prebuilt(): add field row_read_type for
keeping track of semi-consistent reads
row_vers_build_for_semi_consistent_read(),
row_sel_build_committed_vers_for_mysql(): new functions
row_search_for_mysql(): implement semi-consistent reads
#9802: "Foreign key checks disallow alter table".
Added test cases.
#12456: "Cursor shows incorrect data - DML does not affect,
probably caching"
This patch implements a high-granularity read view to be used with
cursors. In this high-granularity consistent read view modifications
done by the creating transaction after the cursor is created or
future transactions are not visible. But those modifications that
transaction did before the cursor was created are visible.
#12701: "Support >4GB buffer pool and log files on 64-bit Windows"
Do not call os_file_create_tmpfile() at runtime. Instead, create all
tempfiles at startup and guard access to them with mutexes.
#13778: "If FOREIGN_KEY_CHECKS=0, one can create inconsistent FOREIGN KEYs".
When FOREIGN_KEY_CHECKS=0 we still need to check that datatypes between
foreign key references are compatible.
#14189: "VARBINARY and BINARY variables: trailing space ignored with InnoDB"
innobase_init(): Assert that
DATA_MYSQL_BINARY_CHARSET_COLL == my_charset_bin.number.
dtype_get_pad_char(): Do not pad VARBINARY or BINARY columns.
row_ins_cascade_calc_update_vec(): Refuse ON UPDATE CASCADE when trying
to change the length of a VARBINARY column that refers to or is referenced
by a BINARY column. BINARY columns are no longer padded on comparison,
and thus they cannot be padded on storage either.
#14747: "Race condition can cause btr_search_drop_page_hash_index() to crash"
Note that buf_block_t::index should be protected by btr_search_latch
or an s-latch or x-latch on the index page.
btr_search_drop_page_hash_index(): Read block->index while holding
btr_search_latch and use the cached value in the loop. Remove some
redundant assertions.
#15108: "mysqld crashes when innodb_log_file_size is set > 4G"
#15308: "Problem of Order with Enum Column in Primary Key"
#15550: "mysqld crashes in printing a FOREIGN KEY error in InnoDB"
row_ins_foreign_report_add_err(): When printing the parent record,
use the index in the parent table rather than the index in the child table.
#15653: "Slow inserts to InnoDB if many thousands of .ibd files"
Keep track on unflushed modifications to file spaces. When there are tens
of thousands of file spaces, flushing all files in fil_flush_file_spaces()
would be very slow.
fil_flush_file_spaces(): Only flush unflushed file spaces.
fil_space_t, fil_system_t: Add a list of unflushed spaces.
#15991: "innodb-file-per-table + symlink database + rename = cr"
os_file_handle_error(): Map the error codes EXDEV, ENOTDIR, and EISDIR
to the new code OS_FILE_PATH_ERROR. Treat this code as OS_FILE_PATH_ERROR.
This fixes the crash on RENAME TABLE when the .ibd file is a symbolic link
to a different file system.
#16157: "InnoDB crashes when main location settings are empty"
This patch is from Heikki.
#16298: "InnoDB segfaults in INSERTs in upgrade of 4.0 -> 5.0 tables
with VARCHAR BINARY"
dict_load_columns(): Set the charset-collation code
DATA_MYSQL_BINARY_CHARSET_COLL for those binary string columns
that lack a charset-collation code, i.e., the tables were created
with an older version of MySQL/InnoDB than 4.1.2.
#16229: "MySQL/InnoDB uses full explicit table locks in trigger processing"
Take a InnoDB table lock only if user has explicitly requested a table
lock. Added some additional comments to store_lock() and external_lock().
#16387: "InnoDB crash when dropping a foreign key <table>_ibfk_0"
Do not mistake TABLENAME_ibfk_0 for auto-generated id.
dict_table_get_highest_foreign_id(): Ignore foreign constraint
identifiers starting with the pattern TABLENAME_ibfk_0.
#16582: "InnoDB: Error in an adaptive hash index pointer to page"
Account for a race condition when dropping the adaptive hash index
for a B-tree page.
btr_search_drop_page_hash_index(): Retry the operation if a hash index
with different parameters was built meanwhile. Add diagnostics for the
case that hash node pointers to the page remain.
btr_search_info_update_hash(), btr_search_info_update_slow():
Document the parameter "info" as in/out.
#16814: "SHOW INNODB STATUS format error in LATEST FOREIGN KEY ERROR
section"
Add a missing newline to the LAST FOREIGN KEY ERROR section in SHOW
INNODB STATUS output.
dict_foreign_error_report(): Always print a newline after invoking
dict_print_info_on_foreign_key_in_create_format().
#16827: "Better InnoDB error message if ibdata files omitted from my.cnf"
#17126: "CHECK TABLE on InnoDB causes a short hang during check of adaptive
hash"
CHECK TABLE blocking other queries, by releasing the btr_search_latch
periodically during the adaptive hash table validation.
#17405: "Valgrind: conditional jump or move depends on unititialised values"
buf_block_init(): Reset magic_n, buf_fix_count and io_fix to avoid
testing uninitialized variables.