This fix was accidentally pushed to mysql-5.1 after the 5.1.59 clone-off in
bzr revision id marko.makela@oracle.com-20110829081642-z0w992a0mrc62s6w
with the fix of Bug#12704861 Corruption after a crash during BLOB update
but not merged to mysql-5.5 and upwards.
In the Barracuda formats, the clustered index record no longer
contains a prefix of off-page columns. Because of this, the undo log
must contain these prefixes, so that purge and multi-versioning will
continue to work. However, this also means that an undo log record can
become too big to fit in an undo log page. (It is a limitation of the
undo log that undo records cannot span across multiple pages.)
In case the checks for undo log size fail when CREATE TABLE or CREATE
INDEX is executed, we need a fallback that blocks a modification
operation when the undo log record would exceed the maximum size.
trx_undo_free_last_page_func(): Renamed from trx_undo_free_page_in_rollback().
Define the trx_t parameter only in debug builds.
trx_undo_free_last_page(): Wrapper for trx_undo_free_last_page_func().
Pass the trx_t parameter only in debug builds.
trx_undo_truncate_end_func(): Renamed from trx_undo_truncate_end().
Define the trx_t parameter only in debug builds. Rewrite a for(;;) loop
as a while loop for clarity.
trx_undo_truncate_end(): Wrapper for trx_undo_truncate_end_func().
Pass the trx_t parameter only in debug builds.
trx_undo_erase_page_end(): Return TRUE if the page was non-empty
to begin with. Refuse to erase empty pages.
trx_undo_report_row_operation(): If the page for which the undo log
was too big was empty, free the undo page and return DB_TOO_BIG_RECORD.
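The fallback described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the struct, the enum and the function name undo_try_append() are hypothetical and are not the InnoDB code. It shows why an empty page is the deciding condition: if the record does not fit into an empty page, retrying on a fresh page cannot help.

```c
#include <stddef.h>

/* Hypothetical result codes; UNDO_TOO_BIG_RECORD plays the role of
   DB_TOO_BIG_RECORD in the text. */
enum undo_result { UNDO_OK, UNDO_RETRY_NEW_PAGE, UNDO_TOO_BIG_RECORD };

/* Simplified undo page: tracks only capacity and bytes used.
   Invariant assumed by this sketch: used <= capacity. */
struct undo_page {
    size_t capacity;
    size_t used;
};

/* Try to append an undo record of rec_len bytes to the page. */
enum undo_result undo_try_append(struct undo_page *page, size_t rec_len)
{
    if (rec_len <= page->capacity - page->used) {
        page->used += rec_len;
        return UNDO_OK;
    }
    if (page->used == 0) {
        /* Even an empty page cannot hold the record, and undo
           records cannot span pages, so give up. */
        return UNDO_TOO_BIG_RECORD;
    }
    /* The page was partly filled: a fresh page may still work. */
    return UNDO_RETRY_NEW_PAGE;
}
```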
rb:749 approved by Inaam Rana
Also addressed issues in bug #11745133, where we could mark a table
corrupted instead of crashing the server when a corrupted buffer/page
is found, if the table was created with innodb_file_per_table on.
DB_COL_APPEARS_TWICE_IN_INDEX: Remove. This condition is already
checked and reported by MySQL before passing the index definition to
the storage engine.
row_create_index_for_mysql(): Remove the redundant check for
DB_COL_APPEARS_TWICE_IN_INDEX. When enforcing the column prefix index
limit, invoke dict_mem_index_free(index) to plug the memory leak. In
the loop, use index->n_def instead of dict_index_get_n_fields(index),
because the latter would be 0 for indexes that have not been copied to
the data dictionary cache.
innodb-use-sys-malloc.test:
Add test cases for attempting to trigger the error checks in
row_create_index_for_mysql(). Before MySQL 5.5 and WL#5743, the leak
is only reproducible if ha_innobase::max_supported_key_part_length()
returned a higher limit than the one used in
row_create_index_for_mysql().
In MySQL 5.5 and later, the leak is reproducible with
innodb_large_prefix=true.
rb:688 approved by Jimmy Yang
With this change, the index prefix column length limit is lifted from
767 bytes to 3072 bytes if "innodb_large_prefix" is set to "true".
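As a minimal sketch of the limit check implied here, assuming a hypothetical helper prefix_len_ok() (the real check lives in row_create_index_for_mysql() and differs in detail):

```c
#include <stdbool.h>

/* Limits quoted in the text above. */
#define PREFIX_LIMIT_DEFAULT 767u
#define PREFIX_LIMIT_LARGE   3072u

/* Hypothetical helper: is a column prefix of len bytes allowed,
   given the innodb_large_prefix setting? */
bool prefix_len_ok(unsigned len, bool large_prefix)
{
    unsigned limit = large_prefix ? PREFIX_LIMIT_LARGE
                                  : PREFIX_LIMIT_DEFAULT;
    return len <= limit;
}
```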
rb://603 approved by Marko
Bug #11766501: Multiple RBS break the get rseg with mininum trx_t::no code during purge
Bug# 59291 changes:
Main problem is that truncating the UNDO log at the completion of every
trx_purge() call is expensive as the number of rollback segments is increased.
We truncate after a configurable amount of pages. The innodb_purge_batch_size
parameter is used to control when InnoDB does the actual truncate. The truncate
is done once after 128 (or TRX_SYS_N_RSEGS) iterations. In other words we
truncate after purging 128 * innodb_purge_batch_size. The smaller the batch
size the quicker we truncate.
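The cadence can be illustrated with a small counter. This is a sketch only: struct purge_state and purge_batch_done() are hypothetical names, and N_RSEGS stands in for TRX_SYS_N_RSEGS.

```c
#include <stdbool.h>

#define N_RSEGS 128u  /* stands in for TRX_SYS_N_RSEGS */

/* Hypothetical bookkeeping: purge batches since the last truncate. */
struct purge_state {
    unsigned iterations;
};

/* Call once per completed purge batch. Returns true when the caller
   should truncate the undo log, i.e. once every N_RSEGS batches. */
bool purge_batch_done(struct purge_state *s)
{
    s->iterations++;
    if (s->iterations >= N_RSEGS) {
        s->iterations = 0;
        return true;
    }
    return false;
}
```

With this scheme a smaller batch size means each batch finishes sooner, so the 128th batch, and with it the truncate, is reached more quickly.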
Introduce a new parameter that controls how many rollback segments to use for
storing REDO information. This is really step 1 in allowing complete control
to the user over rollback space management.
New parameters:
i) innodb_rollback_segments = number of rollback_segments to use
(default is now 128). Dynamic parameter; can be changed at any time.
Currently there is little benefit in changing it from the default.
Optimisations in the patch.
i. Change the O(n) behaviour of trx_rseg_get_on_id() to O(log n)
Backported from 5.6. Refactor some of the binary heap code.
Create a new include/ut0bh.ic file.
ii. Avoid truncating the rollback segments after every purge.
Related changes that were moved to a separate patch:
i. Purge should not do any flushing, only wait for space to be free so that
it only does purging of records unless it is held up by a long running
transaction that is preventing it from progressing.
ii. Give the purge thread preference over transactions when acquiring the
rseg->mutex during commit. This is to avoid purge blocking unnecessarily
when getting the next rollback segment to purge.
Bug #11766501 changes:
Add the rseg to the min binary heap under the cover of the kernel mutex and
the binary heap mutex. This ensures the ordering of the min binary heap.
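For readers unfamiliar with the data structure, here is a minimal min binary heap keyed on the transaction number. It is a sketch, not the ut0bh code: the type and function names are hypothetical, the capacity is fixed, and the locking mentioned above (kernel mutex and binary heap mutex) is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_CAP 128

/* Hypothetical heap element: a rollback segment and the smallest
   transaction number waiting in it. */
struct rseg_entry { uint64_t trx_no; unsigned rseg_id; };
struct rseg_heap  { struct rseg_entry e[HEAP_CAP]; size_t n; };

static void swap_entries(struct rseg_entry *a, struct rseg_entry *b)
{
    struct rseg_entry t = *a; *a = *b; *b = t;
}

/* Insert x, sifting it up. Returns 0, or -1 if the heap is full. */
int rseg_heap_push(struct rseg_heap *h, struct rseg_entry x)
{
    size_t i;
    if (h->n == HEAP_CAP) return -1;
    i = h->n++;
    h->e[i] = x;
    while (i > 0 && h->e[(i - 1) / 2].trx_no > h->e[i].trx_no) {
        swap_entries(&h->e[(i - 1) / 2], &h->e[i]);
        i = (i - 1) / 2;
    }
    return 0;
}

/* Remove the entry with the minimum trx_no. Returns 0, or -1 if empty. */
int rseg_heap_pop(struct rseg_heap *h, struct rseg_entry *out)
{
    size_t i = 0;
    if (h->n == 0) return -1;
    *out = h->e[0];
    h->e[0] = h->e[--h->n];
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < h->n && h->e[l].trx_no < h->e[m].trx_no) m = l;
        if (r < h->n && h->e[r].trx_no < h->e[m].trx_no) m = r;
        if (m == i) break;
        swap_entries(&h->e[m], &h->e[i]);
        i = m;
    }
    return 0;
}
```

In a concurrent setting both operations would have to run under the same mutex, otherwise two pushes could interleave and break the heap ordering.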
The two changes have to be committed together because they share the same
code, which fixes both issues.
rb://567 Approved by: Inaam Rana.
bzr branch mysql-5.1-performance-version mysql-trunk # Summit
cd mysql-trunk
bzr merge mysql-5.1-innodb_plugin # which is 5.1 + Innodb plugin
bzr rm innobase # remove the builtin
Next step: build, test fixes.
Bug #36819: ut_usectime does not handle errors from gettimeofday
Detailed revision comments:
r2480 | vasil | 2008-05-27 11:40:07 +0300 (Tue, 27 May 2008) | 11 lines
branches/5.1:
Fix Bug#36819 ut_usectime does not handle errors from gettimeofday
by retrying gettimeofday() several times if it fails in ut_usectime().
If it fails on all calls then return error to the caller to be handled
at higher level.
Update the variable innodb_row_lock_time_max in SHOW STATUS output only
if ut_usectime() was successful.
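A retry loop of this kind might look like the sketch below. It is not the InnoDB implementation: usectime_retry(), the clock_fn indirection and flaky_clock() are hypothetical names introduced so that the failure path can be exercised. Only gettimeofday() itself is a real POSIX call.

```c
#include <stddef.h>
#include <sys/time.h>

/* Clock source abstraction (an assumption of this sketch). */
typedef int (*clock_fn)(struct timeval *tv);

/* Real clock: thin wrapper around POSIX gettimeofday(). */
int system_clock(struct timeval *tv)
{
    return gettimeofday(tv, NULL);
}

/* Test double: fails on its first two calls, then succeeds. */
static int flaky_calls;
int flaky_clock(struct timeval *tv)
{
    if (flaky_calls++ < 2) return -1;
    tv->tv_sec = 1;
    tv->tv_usec = 500;
    return 0;
}

/* Try the clock up to max_tries times. Returns 0 and fills sec/usec
   on success; returns -1 if every attempt failed, leaving sec/usec
   untouched so that the caller can handle the error. */
int usectime_retry(clock_fn clk, int max_tries,
                   unsigned long *sec, unsigned long *usec)
{
    struct timeval tv;
    int i;
    for (i = 0; i < max_tries; i++) {
        if (clk(&tv) == 0) {
            *sec  = (unsigned long) tv.tv_sec;
            *usec = (unsigned long) tv.tv_usec;
            return 0;
        }
    }
    return -1;
}
```

A caller that maintains a statistic such as a maximum lock wait time would update it only when the return value is 0.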
Fixes the following bugs:
Bug #30706: SQL thread on slave is allowed to block client queries when slave load is high
Add (innodb|innobase|srv)_replication_delay MySQL config parameter.
Bug #30888: Innodb table + stored procedure + row deletion = server crash
While adding code for the low level read of the AUTOINC value from the index,
the case for MEDIUM ints, which are 3 bytes, was missed, triggering an
assertion.
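One way to avoid missing a width is to read the value with a loop over the byte count instead of a switch over known sizes. The following sketch assumes an unsigned, big-endian on-disk encoding and uses the hypothetical name read_uint_be(); it is not the InnoDB reader.

```c
#include <stddef.h>
#include <stdint.h>

/* Read an unsigned big-endian integer of len bytes (1..8) from buf.
   Returns 0 on success, -1 for an unsupported length. Any width in
   range works, including the 3-byte MEDIUMINT case. */
int read_uint_be(const unsigned char *buf, size_t len, uint64_t *out)
{
    uint64_t v = 0;
    size_t i;
    if (len == 0 || len > 8) return -1;
    for (i = 0; i < len; i++) {
        v = (v << 8) | buf[i];
    }
    *out = v;
    return 0;
}
```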
Bug #30907: Regression: "--innodb_autoinc_lock_mode=0" (off) not same as older releases
We don't rely on *first_value to be 0 when checking whether
get_auto_increment() has been invoked for the first time in a multi-row
INSERT. We instead use trx_t::n_autoinc_rows. Initialize trx::n_autoinc_rows
inside ha_innobase::start_stmt() too.
Bug #31444: "InnoDB: Error: MySQL is freeing a thd" in innodb_mysql.test
ha_innobase::external_lock(): Update prebuilt->mysql_has_locked and
trx->n_mysql_tables_in_use only after row_lock_table_for_mysql() returns
DB_SUCCESS. A timeout on LOCK TABLES would lead to an inconsistent state,
which would cause trx_free() to print a warning.
Bug #31494: innodb + 5.1 + read committed crash, assertion
Set an error code when a deadlock occurs in semi-consistent read.
After applying the snapshots, ensure that code conforms to the final version
of WL 3914.
It is significant that, after these changes, InnoDB does not define MYSQL_SERVER,
and can be built as an independent storage engine plugin.
Fixes:
Bug#9709: InnoDB inconsistensy causes "Operating System Error 32/33"
Bug#18828: If InnoDB runs out of undo slots, it returns misleading 'table is full'
Bug#20090: InnoDB: Error: trying to declare trx to enter InnoDB
Bug#20352: Make ibuf_contract_for_n_pages tunable
Bug#21101: Wrong error on exceeding max row size for InnoDB table
Bug#21293: Deadlock detection prefers to kill long running FOR UPDATE queries
Bug#22819: SHOW INNODB STATUS crashes the server with an assertion failure under high load
Bug#25078: Make the replication thread to ignore innodb_thread_concurrency
Bug#25645: Assertion failure in file srv0srv.c
Bug#28138: indexing column prefixes produces corruption in InnoDB
innodb-5.1-ss1318
innodb-5.1-ss1330
innodb-5.1-ss1332
innodb-5.1-ss1340
Fixes:
- Bug #21409: Incorrect result returned when in READ-COMMITTED with query_cache ON
At low transaction isolation levels we let each consistent read set
its own snapshot.
- Bug #23666: strange Innodb_row_lock_time_% values in show status; also millisecs wrong
On Windows, ut_usectime returns secs and usecs relative to the UNIX
epoch (which is Jan 1, 1970).
- Bug #25494: LATEST DEADLOCK INFORMATION is not always cleared
lock_deadlock_recursive(): When the search depth or length is exceeded,
rewind lock_latest_err_file and display the two transactions at the
point of aborting the search.
- Bug #25927: Foreign key with ON DELETE SET NULL on NOT NULL can crash server
Prevent ALTER TABLE ... MODIFY ... NOT NULL on columns for which
there is a foreign key constraint ON ... SET NULL.
- Bug #26835: Repeatable corruption of utf8-enabled tables inside InnoDB
The bug could be reproduced as follows:
Define a table so that the first column of the clustered index is
a VARCHAR or a UTF-8 CHAR in a collation where sequences of bytes
of differing length are considered equivalent.
Insert and delete a record. Before the delete-marked record is
purged, insert another record whose first column is of different
length but equivalent to the first record. Under certain conditions,
the insertion can be incorrectly performed as update-in-place.
Likewise, an operation that could be done as update-in-place can
unnecessarily be performed as delete and insert, but that would not
cause corruption but merely degraded performance.
Bugs fixed:
- Bug #20791 valgrind errors in InnoDB
Remove Valgrind warning of Bug #20791 : in new database
creation, we read the doublewrite buffer magic number from
uninitialized memory; the code worked because it was extremely
unlikely that the memory would contain the magic number
- Bug #21784 DROP TABLE crashes 5.1.12-pre if concurrent
queries on the table
remove update_thd() in ::store_lock()
Also includes numerous coding style fixes, etc. See file-level
comments for details.
Changes in SQL parser:
* Change default mode of SELECT from "lock in share mode"
to "consistent read".
* Remove support from SELECT for specifying "consistent read".
* Add support in SELECT for specifying "lock in share mode".
* Change all uses of SQL parser to specify "lock in share mode".
* Modify syntax so that the only valid top-level statement is
a procedure definition, since it's the only one that actually
works.
* Add support for lock waits.
Fixed BUG#19542 "InnoDB doesn't increase the Handler_read_prev counter".
Fixed BUG#19609 "Case sensitivity of innodb_data_file_path gives stupid error".
Fixed BUG#19727 "InnoDB crashed server and crashed tables are not recoverable".
Also:
* Remove remnants of the obsolete concept of memoryfixing tables and indexes.
* Remove unused dict_table_LRU_trim().
* Remove unused 'trx' parameter from dict_table_get_on_id_low(),
dict_table_get(), dict_table_get_and_increment_handle_count().
* Add a normal linked list implementation.
* Add a work queue implementation.
* Add 'level' parameter to mutex_create() and rw_lock_create().
Remove mutex_set_level() and rw_lock_set_level().
* Rename SYNC_LEVEL_NONE to SYNC_LEVEL_VARYING.
* Add support for bound ids in InnoDB's parser.
* Define UNIV_BTR_DEBUG for enabling consistency checks of
FIL_PAGE_NEXT and FIL_PAGE_PREV when accessing sibling
pages of B-tree indexes.
btr_validate_level(): Check the validity of the doubly linked
list formed by FIL_PAGE_NEXT and FIL_PAGE_PREV.
* Adapt InnoDB to the new tablename to filename encoding in MySQL 5.1.
ut_print_name(), ut_print_name1(): Add parameter 'table_id' for
distinguishing names of tables from other identifiers.
New: innobase_convert_from_table_id(), innobase_convert_from_id(),
innobase_convert_from_filename(), innobase_get_charset.
dict_accept(), dict_scan_id(), dict_scan_col(), dict_scan_table_name(),
dict_skip_word(), dict_create_foreign_constraints_low(): Add
parameter 'cs' so that isspace() can be replaced with my_isspace(),
whose operation depends on the connection character set.
dict_scan_id(): Convert identifier to UTF-8.
dict_str_starts_with_keyword(): New extern function, to replace
dict_accept() in row_search_for_mysql().
mysql_get_identifier_quote_char(): Replaced with innobase_print_identifier().
ha_innobase::create(): Remove the thd->convert_string() call. Pass the
statement to InnoDB in the connection character set and let InnoDB
convert the identifier to UTF-8.
* Add max_row_size to dict_table_t.
* btr0cur.c
btr_copy_externally_stored_field(): Only set the 'offset' variable
when needed.
* buf0buf.c
buf_page_io_complete(): Write to the error log if the page number or
the space id on the disk do not match those in memory. Also write to
the error log if a page was read from the doublewrite buffer. The
doublewrite buffer should be only read by the lower-level function
fil_io() at database startup.
* dict0dict.c
dict_scan_table_name(): Remove fallback to differently encoded name
when the table is not found. The encoding is handled at a higher level.
* ha_innodb.cc
Increment statistic counter in ha_innobase::index_prev() (bug 19542).
Add innobase_convert_string wrapper function and a new file
ha_prototypes.h.
innobase_print_identifier(): Remove TODO comment before calling
get_quote_char_for_identifier(). That function apparently assumes
the identifier to be encoded in UTF-8.
* ibuf0ibuf.c|h
ibuf_count_get(), ibuf_counts[], ibuf_count_inited(): Define these
only #ifdef UNIV_IBUF_DEBUG. Previously, when compiled without
UNIV_IBUF_DEBUG, invoking ibuf_count_get() would crash InnoDB.
The function is only being called #ifdef UNIV_IBUF_DEBUG.
* innodb.result
Adjust the results for changes in the foreign key error messages.
* mem0mem.c|h
New: mem_heap_dup(), mem_heap_printf(), mem_heap_cat().
* os0file.c
Check the page trailers also after writing to disk. This improves
chances of diagnosing bug 18886.
os_file_check_page_trailers(): New function for checking that the
two copies of the LSN stamped on the page match.
os_aio_simulated_handle(): Call os_file_check_page_trailers()
before and after os_file_write().
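The trailer check can be sketched as a comparison of two 4-byte fields. The offsets below are assumptions of this sketch (a 16 KiB page, an 8-byte LSN at byte 16 of the header, and the low 4 bytes of the LSN repeated in the last 4 bytes of the page); the function name is hypothetical.

```c
#include <string.h>

/* Assumed layout, for illustration only. */
#define SKETCH_PAGE_SIZE    16384
#define SKETCH_LSN_OFFSET   16   /* 8-byte LSN in the page header */
#define SKETCH_TRAILER_SIZE 8    /* last 4 bytes: low half of LSN */

/* Returns nonzero if the low 4 bytes of the header LSN equal the
   copy stored at the end of the page. */
int page_trailer_matches(const unsigned char *page)
{
    const unsigned char *head = page + SKETCH_LSN_OFFSET + 4;
    const unsigned char *tail =
        page + SKETCH_PAGE_SIZE - SKETCH_TRAILER_SIZE + 4;
    return memcmp(head, tail, 4) == 0;
}
```

A mismatch suggests that the page was only partially written, which is why checking on both sides of the write helps narrow down where the damage occurred.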
* row0mysql.c
Move trx_commit_for_mysql(trx) calls before calls to
row_mysql_unlock_data_dictionary(trx) (bug 19727).
* row0sel.c
row_fetch_print(): Handle SQL NULL values without crashing.
row_sel_store_mysql_rec(): Remove useless call to rec_get_nth_field
when handling an externally stored column.
Fetch externally stored fields when using InnoDB's internal SQL
parser.
Optimize BLOB selects by using prebuilt->blob_heap directly instead
of first reading BLOB data to a temporary heap and then copying it
to prebuilt->blob_heap.
* srv0srv.c
srv_master_thread(): Remove unreachable code.
* srv0start.c
srv_parse_data_file_paths_and_sizes(): Accept lower-case 'm' and
'g' as abbreviations of megabyte and gigabyte (bug 19609).
srv_parse_megabytes(): New function.
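A parser that accepts both upper-case and lower-case suffixes could look like the following sketch. The name parse_megabytes() is hypothetical, and treating a bare number as a byte count is an assumption of this example.

```c
#include <stdlib.h>

/* Parse "<number>[M|m|G|g]" and store the size in megabytes.
   A bare number is taken as bytes (assumption of this sketch).
   Returns a pointer to the first character after the parsed text. */
const char *parse_megabytes(const char *str, unsigned long *megs)
{
    char *end;
    unsigned long size = strtoul(str, &end, 10);

    switch (*end) {
    case 'G': case 'g':
        size *= 1024;
        /* fall through */
    case 'M': case 'm':
        end++;
        break;
    default:
        size /= 1024 * 1024;
        break;
    }

    *megs = size;
    return end;
}
```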
* ut0dbg.c|h
Implement InnoDB assertions (ut_a and ut_error) with abort() when
the code is compiled with GCC 3 or later on other platforms than
Windows or Netware. Also disable the variable ut_dbg_stop_threads
and the function ut_dbg_stop_thread() in this case, unless
UNIV_SYNC_DEBUG is defined. This should allow the compiler to
generate more compact code for assertions.
* ut0list.c|h
Add ib_list_create_heap().
Fixed BUGS:
#3300: "UPDATE statement with no index column in where condition locks
all rows"
Implement semi-consistent read to reduce lock conflicts at the cost
of breaking serializability.
ha_innobase::unlock_row(): reset the "did semi consistent read" flag
ha_innobase::was_semi_consistent_read(),
ha_innobase::try_semi_consistent_read(): new methods
row_prebuilt_t, row_create_prebuilt(): add field row_read_type for
keeping track of semi-consistent reads
row_vers_build_for_semi_consistent_read(),
row_sel_build_committed_vers_for_mysql(): new functions
row_search_for_mysql(): implement semi-consistent reads
#9802: "Foreign key checks disallow alter table".
Added test cases.
#12456: "Cursor shows incorrect data - DML does not affect,
probably caching"
This patch implements a high-granularity read view to be used with
cursors. In this high-granularity consistent read view, modifications
made by the creating transaction after the cursor is created, or by
future transactions, are not visible. However, modifications that the
transaction made before the cursor was created are visible.
#12701: "Support >4GB buffer pool and log files on 64-bit Windows"
Do not call os_file_create_tmpfile() at runtime. Instead, create all
tempfiles at startup and guard access to them with mutexes.
#13778: "If FOREIGN_KEY_CHECKS=0, one can create inconsistent FOREIGN KEYs".
When FOREIGN_KEY_CHECKS=0 we still need to check that datatypes between
foreign key references are compatible.
#14189: "VARBINARY and BINARY variables: trailing space ignored with InnoDB"
innobase_init(): Assert that
DATA_MYSQL_BINARY_CHARSET_COLL == my_charset_bin.number.
dtype_get_pad_char(): Do not pad VARBINARY or BINARY columns.
row_ins_cascade_calc_update_vec(): Refuse ON UPDATE CASCADE when trying
to change the length of a VARBINARY column that refers to or is referenced
by a BINARY column. BINARY columns are no longer padded on comparison,
and thus they cannot be padded on storage either.
#14747: "Race condition can cause btr_search_drop_page_hash_index() to crash"
Note that buf_block_t::index should be protected by btr_search_latch
or an s-latch or x-latch on the index page.
btr_search_drop_page_hash_index(): Read block->index while holding
btr_search_latch and use the cached value in the loop. Remove some
redundant assertions.
#15108: "mysqld crashes when innodb_log_file_size is set > 4G"
#15308: "Problem of Order with Enum Column in Primary Key"
#15550: "mysqld crashes in printing a FOREIGN KEY error in InnoDB"
row_ins_foreign_report_add_err(): When printing the parent record,
use the index in the parent table rather than the index in the child table.
#15653: "Slow inserts to InnoDB if many thousands of .ibd files"
Keep track of unflushed modifications to file spaces. When there are tens
of thousands of file spaces, flushing all files in fil_flush_file_spaces()
would be very slow.
fil_flush_file_spaces(): Only flush unflushed file spaces.
fil_space_t, fil_system_t: Add a list of unflushed spaces.
#15991: "innodb-file-per-table + symlink database + rename = cr"
os_file_handle_error(): Map the error codes EXDEV, ENOTDIR, and EISDIR
to the new code OS_FILE_PATH_ERROR. Treat this code as OS_FILE_PATH_ERROR.
This fixes the crash on RENAME TABLE when the .ibd file is a symbolic link
to a different file system.
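The errno mapping can be illustrated as below. The enum and the function classify_errno() are hypothetical; only the three errno values come from the text, and the ENOENT case is an assumption added for contrast.

```c
#include <errno.h>

/* Hypothetical internal error classes for this sketch. */
enum os_file_err { FILE_ERR_NOT_FOUND, FILE_ERR_PATH, FILE_ERR_OTHER };

/* Classify an errno value from a failed file operation. */
enum os_file_err classify_errno(int err)
{
    switch (err) {
    case ENOENT:
        return FILE_ERR_NOT_FOUND;
    case EXDEV:    /* rename across file systems */
    case ENOTDIR:  /* a path component is not a directory */
    case EISDIR:   /* the target is a directory */
        return FILE_ERR_PATH;
    default:
        return FILE_ERR_OTHER;
    }
}
```

Returning a distinct class lets the caller report a failed RENAME as an ordinary error instead of treating it as fatal.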
#16157: "InnoDB crashes when main location settings are empty"
This patch is from Heikki.
#16298: "InnoDB segfaults in INSERTs in upgrade of 4.0 -> 5.0 tables
with VARCHAR BINARY"
dict_load_columns(): Set the charset-collation code
DATA_MYSQL_BINARY_CHARSET_COLL for those binary string columns
that lack a charset-collation code, i.e., the tables were created
with an older version of MySQL/InnoDB than 4.1.2.
#16229: "MySQL/InnoDB uses full explicit table locks in trigger processing"
Take an InnoDB table lock only if the user has explicitly requested a table
lock. Added some additional comments to store_lock() and external_lock().
#16387: "InnoDB crash when dropping a foreign key <table>_ibfk_0"
Do not mistake TABLENAME_ibfk_0 for auto-generated id.
dict_table_get_highest_foreign_id(): Ignore foreign constraint
identifiers starting with the pattern TABLENAME_ibfk_0.
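The rule can be sketched as a small parser for constraint names. auto_foreign_id() is a hypothetical helper, not the InnoDB function: it returns the numeric suffix of an id of the form TABLENAME_ibfk_N, and 0 for anything that should not count as auto-generated, including suffixes that begin with the digit 0.

```c
#include <stdlib.h>
#include <string.h>

/* Return N if id is exactly "<table>_ibfk_<N>" with N starting with
   a digit 1..9; return 0 otherwise (user-chosen names, a leading
   zero as in "t_ibfk_0", or trailing garbage). */
unsigned long auto_foreign_id(const char *table, const char *id)
{
    static const char infix[] = "_ibfk_";
    size_t tlen = strlen(table);
    size_t ilen = sizeof infix - 1;
    const char *num;
    char *end;
    unsigned long n;

    if (strncmp(id, table, tlen) != 0
        || strncmp(id + tlen, infix, ilen) != 0) {
        return 0;
    }

    num = id + tlen + ilen;
    if (*num < '1' || *num > '9') {
        return 0;  /* not a number an auto-generator would produce */
    }

    n = strtoul(num, &end, 10);
    return (*end == '\0') ? n : 0;
}
```

Taking the maximum of this value over all constraints of a table then yields the highest auto-generated id without being confused by a user-supplied name such as t_ibfk_0.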
#16582: "InnoDB: Error in an adaptive hash index pointer to page"
Account for a race condition when dropping the adaptive hash index
for a B-tree page.
btr_search_drop_page_hash_index(): Retry the operation if a hash index
with different parameters was built meanwhile. Add diagnostics for the
case that hash node pointers to the page remain.
btr_search_info_update_hash(), btr_search_info_update_slow():
Document the parameter "info" as in/out.
#16814: "SHOW INNODB STATUS format error in LATEST FOREIGN KEY ERROR
section"
Add a missing newline to the LATEST FOREIGN KEY ERROR section in SHOW
INNODB STATUS output.
dict_foreign_error_report(): Always print a newline after invoking
dict_print_info_on_foreign_key_in_create_format().
#16827: "Better InnoDB error message if ibdata files omitted from my.cnf"
#17126: "CHECK TABLE on InnoDB causes a short hang during check of adaptive
hash"
Prevent CHECK TABLE from blocking other queries, by releasing the
btr_search_latch periodically during the adaptive hash table validation.
#17405: "Valgrind: conditional jump or move depends on unititialised values"
buf_block_init(): Reset magic_n, buf_fix_count and io_fix to avoid
testing uninitialized variables.