FREED LOCK
ANALYIS
-------
In 5.5 code the lock_rec_block_validate() is called after releasing
the kernel mutex. There is a chance that the lock might be invalid so,
we are getting the valgrind error on invalid read on lock->index.
FIX
---
Fix would be to copy the lock->index when we are holding the kernel mutex
and then pass it to the lock_rec_block_validate(). This implementation
is present in 5.1 code.
[ Approved by sunny rb.no.oracle.com/rb/r/2152/ ]
Problem:
During the index intersect access method, the SQL layer will access one row,
that satisfies a set of conditions, using an index i1. And then it will try to
access the same row, with other set of conditions using the next index i2. If
the fetch from i2 fails (we are talking about an error situation here and not
simply an unmatched row situation), then it will unlock the row accessed via
i1. This will work in all situations except deadlock error.
When a deadlock happens, InnoDB will rollback the transaction. InnoDB intimates
the SQL layer about this through the THD::transaction_rollback_request member.
But this is not currently used by the SQL layer.
Solution:
When an error happens, the SQL layer must check the
THD::transaction_rollback_request member, before calling handler::unlock_row().
We have also added a debug assert in ha_innobase::unlock_row() checking that
it must be called only when the transaction is in active state.
rb#1773 approved by Marko and Sunny.
IN LOCK_VALIDATE()
rb://917
approved by: Marko Makela
In lock_validate() the limit is used to release the kernel_mutex during
the validation, to obey the latching order.
If we do the limit++ then we are rechecking the same lock most times on
each iteration because limit is being incremented by one and
<space, page_no> will nearly always be > limit. If we set the limit
correctly to (space, page+1) then we are actually making progress
during the iteration.
print page dump
buf_page_print(): Remove the ut_ad(0) from the beginning. Add two flags
(enum buf_page_print_flags) that can be bitwise-ORed together:
BUF_PAGE_PRINT_NO_CRASH:
Do not crash debug builds at the end of buf_page_print().
BUF_PAGE_PRINT_NO_FULL:
Do not print the full page dump. This can be useful when adding
diagnostic printout to flushing or to the doublewrite buffer.
trx_sys_doublewrite_init_or_restore_page(): Replace exit(1) with ut_error,
so that we can get a core dump if this extraordinary condition happens.
rb:924 approved by Sunny Bains
This fix does not remove the underlying cause of the assertion
failure. It just works around the problem, allowing a corrupted
secondary index to be fixed by DROP INDEX and CREATE INDEX (or in the
worst case, by re-creating the table).
ibuf_delete(): If the record to be purged is the last one in the page
or it is not delete-marked, refuse to purge it. Instead, write an
error message to the error log and let a debug assertion fail.
ibuf_set_del_mark(): If the record to be delete-marked is not found,
display some more information in the error log and let a debug
assertion fail.
row_undo_mod_del_unmark_sec_and_undo_update(),
row_upd_sec_index_entry(): Let a debug assertion fail when the record
to be delete-marked is not found.
buf_page_print(): Add ut_ad(0) so that corruption will be more
prominent in stress testing with debug binaries. Add ut_ad(0) here and
there where corruption is noticed.
btr_corruption_report(): Display some data on page_is_comp() mismatch.
btr_assert_not_corrupted(): A wrapper around btr_corruption_report().
Assert that page_is_comp() agrees with the table flags.
rb:911 approved by Inaam Rana
Fix a deadlock in the initial patch. lock_validate() must not hold the
lock system mutex while s-latching a block, because some functions,
such as lock_rec_convert_impl_to_expl(), may be already holding an x-latch
on the block that lock_validate() is interested in while attempting to
acquire the lock system mutex.
This deadlock was not caught by UNIV_SYNC_DEBUG because of
buf_block_dbg_add_level(block, SYNC_NO_ORDER_CHECK).
lock_clust_rec_some_has_impl(), row_get_rec_trx_id(),
lock_rec_queue_validate(), lock_table_other_has_incompatible(),
lock_table_has_to_wait_in_queue(), lock_table_queue_validate():
Add const qualifiers.
row_get_trx_id_offset(): Add const qualifiers. Keep the parameter rec
only in UNIV_DEBUG builds. Inline the function.
lock_rec_validate_page(): Take the buffer block as a parameter, to
avoid a buf_page_get_gen() call in most cases.
lock_rec_validate_page_low(): A version of lock_rec_validate_page()
that assumes that the lock system mutexes are already being held.
lock_rec_get_next_on_page_const(): A const variant of
lock_rec_get_next_on_page().
lock_validate(): Do not release the lock system mutex while
buffer-fixing the block for the lock_rec_validate_page() call.
Releasing the mutex apparently caused the assertion failure.
rb:665 approved by Sunny Bains
It is filling the error log when testing the debug version of the server.
The printout only seems to be useful when debugging a crash, not when
testing an instrumented version of the server.
Fix compiler warning:
lock/lock0lock.c: In function 'lock_print_info_all_transactions':
lock/lock0lock.c:4299:10: error: variable 'page' set but not used [-Werror=unused-but-set-variable]
------------------------------------------------------------
revno: 3495
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Wed 2010-06-02 13:37:14 +0300
message:
Bug#53674: InnoDB: Error: unlock row could not find a 4 mode lock on the record
In semi-consistent read, only unlock freshly locked non-matching records.
lock_rec_lock_fast(): Return LOCK_REC_SUCCESS,
LOCK_REC_SUCCESS_CREATED, or LOCK_REC_FAIL instead of TRUE/FALSE.
enum db_err: Add DB_SUCCESS_LOCKED_REC for indicating a successful
operation where a record lock was created.
lock_sec_rec_read_check_and_lock(),
lock_clust_rec_read_check_and_lock(), lock_rec_enqueue_waiting(),
lock_rec_lock_slow(), lock_rec_lock(), row_ins_set_shared_rec_lock(),
row_ins_set_exclusive_rec_lock(), sel_set_rec_lock(),
row_sel_get_clust_rec_for_mysql(): Return DB_SUCCESS_LOCKED_REC if a
new record lock was created. Adjust callers.
row_unlock_for_mysql(): Correct the function documentation.
row_prebuilt_t::new_rec_locks: Correct the documentation.
In semi-consistent read, only unlock freshly locked non-matching records.
Define DB_SUCCESS_LOCKED_REC for indicating a successful operation
where a record lock was created.
lock_rec_lock_fast(): Return LOCK_REC_SUCCESS,
LOCK_REC_SUCCESS_CREATED, or LOCK_REC_FAIL instead of TRUE/FALSE.
lock_sec_rec_read_check_and_lock(),
lock_clust_rec_read_check_and_lock(), lock_rec_enqueue_waiting(),
lock_rec_lock_slow(), lock_rec_lock(), row_ins_set_shared_rec_lock(),
row_ins_set_exclusive_rec_lock(), sel_set_rec_lock(),
row_sel_get_clust_rec_for_mysql(): Return DB_SUCCESS_LOCKED_REC if a
new record lock was created. Adjust callers.
row_unlock_for_mysql(): Correct the function documentation.
row_prebuilt_t::new_rec_locks: Correct the documentation.
------------------------------------------------------------
revno: 3421
revision-id: marko.makela@oracle.com-20100426131029-1ffja69h6n88q6bo
parent: marko.makela@oracle.com-20100426112609-f7lgl8crw4x4sfkk
committer: Marko M?kel? <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Mon 2010-04-26 16:10:29 +0300
message:
lock_rec_queue_validate(): Disable a bogus check that
a transaction that holds a lock on a clustered index record
also holds a lock on the secondary index record.
modified:
storage/innobase/lock/lock0lock.c 2@cee13dc7-1704-0410-992b-c9b4543f1246:trunk%2Flock%2Flock0lock.c
storage/innodb_plugin/lock/lock0lock.c 2@16c675df-0fcb-4bc9-8058-dcc011a37293:trunk%2Flock%2Flock0lock.c
------------------------------------------------------------
Detailed revision comments:
r6545 | jyang | 2010-02-03 03:57:32 +0200 (Wed, 03 Feb 2010) | 8 lines
branches/5.1: Fix bug #49001, "SHOW INNODB STATUS deadlock info
incorrect when deadlock detection aborts". Print the correct
lock owner when recursive function lock_deadlock_recursive()
exceeds its maximum depth LOCK_MAX_DEPTH_IN_DEADLOCK_CHECK.
rb://217, approved by Marko.
and also applying 5.1-ss6355
Detailed revision comments:
r6324 | jyang | 2009-12-17 06:54:24 +0200 (Thu, 17 Dec 2009) | 8 lines
branches/5.1: Fix bug #47814 - Diagnostics are frequently not
printed after a long lock wait in InnoDB. Separate out the
lock wait timeout check thread from monitor information
printing thread.
rb://200 Approved by Marko.
r6349 | marko | 2009-12-22 11:09:54 +0200 (Tue, 22 Dec 2009) | 3 lines
branches/5.1: lock_print_info_summary(): Remove a reference to
innobase_mysql_end_print_arbitrary_thd() that should have been
removed in r6347 when removing the function.
r6350 | marko | 2009-12-22 11:11:09 +0200 (Tue, 22 Dec 2009) | 1 line
branches/5.1: Remove an obsolete declaration of LOCK_thread_count.
bzr branch mysql-5.1-performance-version mysql-trunk # Summit
cd mysql-trunk
bzr merge mysql-5.1-innodb_plugin # which is 5.1 + Innodb plugin
bzr rm innobase # remove the builtin
Next step: build, test fixes.
problems
1) BUG#39320 - innodb crash in file btr/btr0pcur.c line 217 with
innodb_locks_unsafe_for_binlog
2) Fixes bug in multi-table semi consistent reads.
3) Fixes email address from dev@innodb.com to innodb_dev_ww@oracle.com
4) Fixes warning message generated by main.innodb test
Detailed revision comments:
r4399 | marko | 2009-03-12 09:38:05 +0200 (Thu, 12 Mar 2009) | 5 lines
branches/5.1: row_sel_get_clust_rec_for_mysql(): Store the cursor position
also for unlock_row(). (Bug #39320)
rb://96 approved by Heikki Tuuri.
r4400 | marko | 2009-03-12 10:06:44 +0200 (Thu, 12 Mar 2009) | 8 lines
branches/5.1: Fix a bug in multi-table semi-consistent reads.
Remember the acquired record locks per table handle (row_prebuilt_t)
rather than per transaction (trx_t), so that unlock_row should successfully
unlock all non-matching rows in multi-table operations.
This deficiency was found while investigating Bug #39320.
rb://94 approved by Heikki Tuuri.
r4481 | marko | 2009-03-19 15:01:48 +0200 (Thu, 19 Mar 2009) | 6 lines
branches/5.1: row_unlock_for_mysql(): Do not unlock records that were
modified by the current transaction. This bug was introduced or unmasked
in r4400.
rb://97 approved by Heikki Tuuri
r4573 | vasil | 2009-03-30 14:17:13 +0300 (Mon, 30 Mar 2009) | 4 lines
branches/5.1:
Fix email address from dev@innodb.com to innodb_dev_ww@oracle.com
r4574 | vasil | 2009-03-30 14:27:08 +0300 (Mon, 30 Mar 2009) | 38 lines
branches/5.1:
Restore the state of INNODB_THREAD_CONCURRENCY to silence this warning:
TEST RESULT TIME (ms)
------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 250, with reserved ports 12500..12509
main.innodb [ pass ] 8803
MTR's internal check of the test case 'main.innodb' failed.
This means that the test case does not preserve the state that existed
before the test case was executed. Most likely the test case did not
do a proper clean-up.
This is the diff of the states of the servers before and after the
test case was executed:
mysqltest: Logging to '/tmp/autotest.sh-20090330_033000-5.1.5Hg8CY/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.log'.
mysqltest: Results saved in '/tmp/autotest.sh-20090330_033000-5.1.5Hg8CY/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result'.
mysqltest: Connecting to server localhost:12500 (socket /tmp/autotest.sh-20090330_033000-5.1.5Hg8CY/mysql-5.1/mysql-test/var/tmp/mysqld.1.sock) as 'root', connection 'default', attempt 0 ...
mysqltest: ... Connected.
mysqltest: Start processing test commands from './include/check-testcase.test' ...
mysqltest: ... Done processing test commands.
--- /tmp/autotest.sh-20090330_033000-5.1.5Hg8CY/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.result 2009-03-30 14:12:31.000000000 +0300
+++ /tmp/autotest.sh-20090330_033000-5.1.5Hg8CY/mysql-5.1/mysql-test/var/tmp/check-mysqld_1.reject 2009-03-30 14:12:41.000000000 +0300
@@ -99,7 +99,7 @@
INNODB_SUPPORT_XA ON
INNODB_SYNC_SPIN_LOOPS 20
INNODB_TABLE_LOCKS ON
-INNODB_THREAD_CONCURRENCY 8
+INNODB_THREAD_CONCURRENCY 16
INNODB_THREAD_SLEEP_DELAY 10000
INSERT_ID 0
INTERACTIVE_TIMEOUT 28800
mysqltest: Result content mismatch
not ok
r4576 | vasil | 2009-03-30 16:25:10 +0300 (Mon, 30 Mar 2009) | 4 lines
branches/5.1:
Revert a change to Makefile.am that I committed accidentally in c4574.
Bug #42152: Race condition in lock_is_table_exclusive()
Detailed revision comments:
r4005 | marko | 2009-01-20 16:22:36 +0200 (Tue, 20 Jan 2009) | 8 lines
branches/5.1: lock_is_table_exclusive(): Acquire kernel_mutex before
accessing table->locks and release kernel_mutex before returning from
the function. This fixes a portential race condition in the
"commit every 10,000 rows" in ALTER TABLE, CREATE INDEX, DROP INDEX,
and OPTIMIZE TABLE. (Bug #42152)
rb://80 approved by Heikki Tuuri
Bug#38231: Innodb crash in lock_reset_all_on_table() on TRUNCATE + LOCK / UNLOCK
branches/5.1:
Fix Bug#38231 Innodb crash in lock_reset_all_on_table() on TRUNCATE + LOCK / UNLOCK
In TRUNCATE TABLE and discard tablespace: do not remove table-level S
and X locks and do not assert on such locks not being wait locks.
Leave such locks alone.
Approved by: Heikki (rb://14)
Bug#37531, Bug#36941, Bug#36941, Bug#36942, Bug#38185.
Also include test case from Bug 34300 which was left out from earlier snapshot
(5.1-ss2387).
Also include fix for Bug #29507, "TRUNCATE shows to many rows effected", since
the fix for Bug 37531 depends on it.
Bug #16979: AUTO_INC lock in InnoDB works a table level lock
Add a table level counter that tracks the number of AUTOINC locks that are
pending and/or granted on a table. We peek at this value to determine whether
a transaction doing a simple INSERT in innodb_autoinc_lock_mode = 1, needs to
acquire the AUTOINC lock or not. This change is related to Bug# 16979.
Bug #27950: Duplicate entry error in auto-inc after mysqld restart
We check whether the AUTOINC sub-system has been initialized (first) by
holding the AUTOINC mutex and if initialization is required then we
initialize using our normal procedure.
Fixes:
- Bug #23710: crash_commit_before fails if innodb_file_per_table=1
- Bug #28254: innodb crash if shutdown during innodb_table_monitor is running
- Bug #28604: innodb_force_recovery restricts data dump
- Bug #29097: fsp_get_available_space_in_free_extents() is capped at 4TB
- Bug #29155: Innodb "Parallel recovery" is not prevented
After applying the snapshots, ensure that code conforms to the final version
of WL 3914.
It is signficant that, after these changes, InnoDB does not define MYSQL_SERVER,
and can be built as an independent storage engine plugin.
Fixes:
Bug#9709: InnoDB inconsistensy causes "Operating System Error 32/33"
Bug#18828: If InnoDB runs out of undo slots, it returns misleading 'table is full'
Bug#20090: InnoDB: Error: trying to declare trx to enter InnoDB
Bug#20352: Make ibuf_contract_for_n_pages tunable
Bug#21101: Wrong error on exceeding max row size for InnoDB table
Bug#21293: Deadlock detection prefers to kill long running FOR UPDATE queries
Bug#22819: SHOW INNODB STATUS crashes the server with an assertion failure under high load
Bug#25078: Make the replication thread to ignore innodb_thread_concurrency
Bug#25645: Assertion failure in file srv0srv.c
Bug#28138: indexing column prefixes produces corruption in InnoDB
innodb-5.1-ss1318
innodb-5.1-ss1330
innodb-5.1-ss1332
innodb-5.1-ss1340
Fixes:
- Bug #21409: Incorrect result returned when in READ-COMMITTED with query_cache ON
At low transaction isolation levels we let each consistent read set
its own snapshot.
- Bug #23666: strange Innodb_row_lock_time_% values in show status; also millisecs wrong
On Windows ut_usectime returns secs and usecs relative to the UNIX
epoch (which is Jan, 1 1970).
- Bug #25494: LATEST DEADLOCK INFORMATION is not always cleared
lock_deadlock_recursive(): When the search depth or length is exceeded,
rewind lock_latest_err_file and display the two transactions at the
point of aborting the search.
- Bug #25927: Foreign key with ON DELETE SET NULL on NOT NULL can crash server
Prevent ALTER TABLE ... MODIFY ... NOT NULL on columns for which
there is a foreign key constraint ON ... SET NULL.
- Bug #26835: Repeatable corruption of utf8-enabled tables inside InnoDB
The bug could be reproduced as follows:
Define a table so that the first column of the clustered index is
a VARCHAR or a UTF-8 CHAR in a collation where sequences of bytes
of differing length are considered equivalent.
Insert and delete a record. Before the delete-marked record is
purged, insert another record whose first column is of different
length but equivalent to the first record. Under certain conditions,
the insertion can be incorrectly performed as update-in-place.
Likewise, an operation that could be done as update-in-place can
unnecessarily be performed as delete and insert, but that would not
cause corruption but merely degraded performance.
Fixes:
- Bug #24712: SHOW TABLE STATUS for file-per-table showing incorrect time fields
- Bug #24386: Performance degradation caused by instrumentation in mutex_struct
- Bug #24190: many exportable definitions of field_in_record_is_null
- Bug #21468: InnoDB crash during recovery with corrupted data pages: XA bug?