BY A CONCURRENT TRANSACTION
The member function QUICK_RANGE_SELECT::init_ror_merged_scan() performs
a table handler clone. InnoDB did not provide a clone operation: there
was no ha_innobase::clone(), and the base handler::clone() does not
take care of ha_innobase->prebuilt->select_lock_type. Because of this,
for one index we did a locking read, while for the other index we did
a non-locking (consistent) read.
The patch introduces the ha_innobase::clone() member function.
It is implemented similarly to ha_myisam::clone(): it calls the
base class handler::clone() and then performs any additional
operations required, in this case setting
ha_innobase->prebuilt->select_lock_type correctly.
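Below is a minimal C++ sketch of the shape of the fix, using simplified
stand-in types (the real signatures live in handler.h and ha_innodb.cc;
MEM_ROOT is left opaque, and the base clone() body is a stand-in):

/* Sketch only: simplified stand-ins for the real MySQL/InnoDB types. */

struct MEM_ROOT;

enum select_lock_type_t { LOCK_NONE, LOCK_S, LOCK_X };

struct row_prebuilt_t {
    select_lock_type_t select_lock_type;
        /* lock mode used by row_search_for_mysql() */
};

class handler {
public:
    virtual ~handler() {}
    virtual handler* clone(const char* name, MEM_ROOT* mem_root)
    {
        /* stand-in: the real base implementation allocates and
        opens a new handler instance on the same table */
        (void) name; (void) mem_root;
        return(0);
    }
};

class ha_innobase : public handler {
public:
    row_prebuilt_t* prebuilt;

    /* The fix: clone via the base class, then propagate the lock
    mode so that the cloned handler performs the same kind of read
    (locking vs. consistent) as the original. */
    virtual handler* clone(const char* name, MEM_ROOT* mem_root)
    {
        ha_innobase* new_handler = static_cast<ha_innobase*>(
            handler::clone(name, mem_root));

        if (new_handler) {
            new_handler->prebuilt->select_lock_type
                = prebuilt->select_lock_type;
        }

        return(new_handler);
    }
};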
rb://1060 approved by Marko
The test case must insert all the records using a single transaction;
otherwise the test case takes more than 15 minutes and will time out
in pb2 and mtr. The test works by repeatedly re-creating the table:
truncating, then inserting the same set of rows. When a table is
re-created with the same set of rows, the data file size must
not grow.
rb:968
Approved by Marko.
There are two threads. In one thread, a DML operation is going on
involving a cascaded update operation. In another thread, an ALTER
TABLE ... ADD FOREIGN KEY constraint is happening. Under these
circumstances, it is possible for the DML thread to access a
dict_foreign_t object that has been freed by the DDL thread.
The debug sync test case provides the sequence of operations.
Without the fix, the test case crashes the server (because of a
newly added assert). With the fix, the ALTER TABLE statement returns
an error message.
Backporting the fix from MySQL 5.5 to 5.1
rb:961
rb:947
Suppress the innodb_bug34300 test from failing if InnoDB prints:
120221 11:05:03 InnoDB: ERROR: the age of the last checkpoint is 9439048,
InnoDB: which exceeds the log group capacity 9433498.
By default the log capacity is 2 log files, 5 MB each.
GRACEFUL SHUTDOWN
During startup mysql picks up .frm files from the tmpdir directory and
tries to drop those tables in the storage engine.
The problem is that when tmpdir ends in /, ha_innobase::delete_table()
is passed a string like "/var/tmp//#sql123"; it wrongly normalizes it
to "/#sql123" and calls row_drop_table_for_mysql(), which of course fails
to delete the table entry from the InnoDB dictionary cache.
ha_innobase::delete_table() returns an error but nevertheless mysql wipes
away the .frm file and the entry in the InnoDB dictionary cache remains
orphaned with no easy way to remove it.
The "no easy" way to remove it is to create a similar temporary table again,
copy its .frm file to tmpdir under "#sql123.frm" and restart mysqld with
tmpdir=/var/tmp (no trailing slash) - this way mysql will pick the .frm file
after restart and will try to issue drop table for "/var/tmp/#sql123"
(notice do double slash), ha_innobase::delete_table() will normalize it to
"tmp/#sql123" and row_drop_table_for_mysql() will successfully remove the
table entry from the dictionary cache.
The solution is to fix normalize_table_name_low() to normalize things like
"/var/tmp//table" correctly to "tmp/table".
This patch also adds a test function that invokes
normalize_table_name_low() with various inputs to make sure it works
correctly, and an mtr test that calls this test function.
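A minimal C++ sketch of the normalization idea follows, under the
assumption of a simplified signature (the real normalize_table_name_low()
in ha_innodb.cc also handles case folding and Windows path separators):

#include <cstring>
#include <cstdio>

/* Sketch: produce "dbdir/table" from a path like "/var/tmp//table",
skipping any run of '/' between the database directory and the name. */
static void normalize_table_name_low(char* norm_name, const char* name)
{
    const char* name_end = name + strlen(name);

    /* Find the start of the table name (after the last run of '/'). */
    const char* ptr = name_end;
    while (ptr > name && ptr[-1] != '/') {
        ptr--;
    }
    const char* table = ptr;

    /* Skip the slashes (possibly more than one) before the name. */
    while (ptr > name && ptr[-1] == '/') {
        ptr--;
    }

    /* Find the start of the database directory component. */
    const char* db = ptr;
    while (db > name && db[-1] != '/') {
        db--;
    }

    sprintf(norm_name, "%.*s/%s", (int) (ptr - db), db, table);
}

int main()
{
    char buf[64];
    normalize_table_name_low(buf, "/var/tmp//#sql123");
    printf("%s\n", buf);    /* expected: tmp/#sql123 */
    normalize_table_name_low(buf, "/var/tmp/#sql123");
    printf("%s\n", buf);    /* expected: tmp/#sql123 */
    return(0);
}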
Reviewed by: Marko (http://bur03.no.oracle.com/rb/r/929/)
If we meet DB_TOO_MANY_CONCURRENT_TRXS during the execution of
tab_create_graph from row_create_table_for_mysql(), the .ibd file for
the table has already been created, but it was not deleted as part of
the error handling.
rb:875 approved by Jimmy Yang
CREATE TABLE bug13510739 (c INTEGER NOT NULL, PRIMARY KEY (c)) ENGINE=INNODB;
INSERT INTO bug13510739 VALUES (1), (2), (3), (4);
DELETE FROM bug13510739 WHERE c=2;
HANDLER bug13510739 OPEN;
HANDLER bug13510739 READ `primary` = (2);
HANDLER bug13510739 READ `primary` NEXT; <-- crash
The bug is that in the particular test case row_search_for_mysql() picked up
a delete-marked record and quit, leaving the cursor in a non-positioned
state; on the subsequent 'get next' call the code crashed because of the
non-positioned cursor.
In row0sel.cc (line numbers from mysql-trunk):
4653 if (rec_get_deleted_flag(rec, comp)) {
...
4679 if (index == clust_index && unique_search) {
4680
4681 err = DB_RECORD_NOT_FOUND;
4682
4683 goto normal_return;
4684 }
it quit from here, not storing the cursor position.
In contrast, if the record=2 is not found at all (e.g. sleep(1) after DELETE
to let the purge wipe it away completely) then 'get = 2' does find record=3
and quits from here:
4366 if (0 != cmp_dtuple_rec(search_tuple, rec, offsets)) {
...
4394 btr_pcur_store_position(pcur, &mtr);
4395
4396 err = DB_RECORD_NOT_FOUND;
4397 #if 0
4398 ut_print_name(stderr, trx, FALSE, index->name);
4399 fputs(" record not found 3\n", stderr);
4400 #endif
4401
4402 goto normal_return;
Another fix could be to extend the condition on line 4366 to hold only if
search_tuple matches rec AND rec is not delete-marked.
Notice that in the above test case, if we wait about 1 second somewhere after
DELETE and before 'get = 2', then the test case does not crash and returns 4
instead. Not sure if this is the correct behavior, but this bugfix removes
the crash and makes the code return what it also returns in the non-crashing
case (if rec=2 is not found during 'get = 2', e.g. when we have sleep(1)
there).
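The essence of the fix at the first quoted site is to store the cursor
position before quitting. A sketch of the changed branch (a fragment that
assumes the surrounding row0sel.cc context; not the literal patch):

    if (rec_get_deleted_flag(rec, comp)) {
        /* ... */
        if (index == clust_index && unique_search) {

            /* Store the cursor position so that a subsequent
            'get next' can restore it, just as the path on
            line 4394 already does for the not-found case. */
            btr_pcur_store_position(pcur, &mtr);

            err = DB_RECORD_NOT_FOUND;

            goto normal_return;
        }
        /* ... */
    }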
Approved by: Marko (http://bur03.no.oracle.com/rb/r/863/)
The counter handler_read_key (SSV::ha_read_key_count) is incremented
incorrectly.
The mysql server maintains a per thread system_status_var (SSV)
object. This object contains among other things the counter
SSV::ha_read_key_count. The purpose of this counter is to measure the
number of requests to read a row based on a key (or the number of
index lookups).
This counter was wrongly incremented in
ha_innobase::innobase_get_index(). The fix removes
this increment statement (for both innodb and innodb_plugin).
The various callers of innobase_get_index() were checked to
determine whether any of them must increment this counter (i.e. whether
they first call innobase_get_index() and then perform an index lookup).
It was found that no caller of innobase_get_index() needs to worry
about the SSV::ha_read_key_count counter.
The bug was accidentally fixed by fixing
Bug#11759688 52020: InnoDB can still deadlock on just INSERT...ON DUPLICATE KEY
a.k.a. the reintroduction of
Bug#7975 deadlock without any locking, simple select and update
Bug#7975 was reintroduced when the storage engine API was made
pluggable in MySQL 5.1. Instead of looking at thd->lex directly, we
rely on handler::extra(). But, we were looking at the wrong extra()
flag, and we were ignoring the TRX_DUP_REPLACE flag in places where we
should obey it.
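A C++ sketch of the idea, using simplified stand-ins (the real handling
is in ha_innobase::extra(); the enum type and trx struct here are
hypothetical simplifications of the real MySQL/InnoDB definitions):

#include <cstdint>

enum ha_extra_function_sketch {
    HA_EXTRA_IGNORE_DUP_KEY,        /* e.g. INSERT IGNORE */
    HA_EXTRA_NO_IGNORE_DUP_KEY,
    HA_EXTRA_WRITE_CAN_REPLACE,     /* e.g. REPLACE: overwrite duplicates */
    HA_EXTRA_WRITE_CANNOT_REPLACE
};

static const uint32_t TRX_DUP_IGNORE = 1;   /* duplicates are ignored */
static const uint32_t TRX_DUP_REPLACE = 2;  /* duplicates are updated */

struct trx_sketch {
    uint32_t duplicates;    /* TRX_DUP_IGNORE | TRX_DUP_REPLACE */
};

/* The essence of the fix: keep trx->duplicates in sync with the extra()
flags sent by the SQL layer, instead of peeking at thd->lex, and obey
TRX_DUP_REPLACE wherever duplicates are handled. */
static void
handle_extra(trx_sketch* trx, ha_extra_function_sketch operation)
{
    switch (operation) {
    case HA_EXTRA_IGNORE_DUP_KEY:
        trx->duplicates |= TRX_DUP_IGNORE;
        break;
    case HA_EXTRA_WRITE_CAN_REPLACE:
        trx->duplicates |= TRX_DUP_REPLACE;
        break;
    case HA_EXTRA_WRITE_CANNOT_REPLACE:
        trx->duplicates &= ~TRX_DUP_REPLACE;
        break;
    case HA_EXTRA_NO_IGNORE_DUP_KEY:
        trx->duplicates &= ~(TRX_DUP_IGNORE | TRX_DUP_REPLACE);
        break;
    }
}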
innodb_replace.test: Add tests for (hopefully) all affected statement
types, so that the bug should never resurface. This kind of test
should have been added when Bug#7975 was first fixed in MySQL 5.0.3.
rb:806 approved by Sunny Bains
In the ON UPDATE CASCADE clause of FOREIGN KEY constraints, the
calculated update vector was not fully initialized. This bug was
introduced in the InnoDB Plugin when implementing support for
ROW_FORMAT=DYNAMIC.
Additionally, the data type information was not initialized, but
apparently it has never been needed in this case. Nevertheless, it is
not good programming practice to pass uninitialized values around.
calc_row_difference(): Declare the update field uninitialized in
Valgrind. Copy the data type information as well, except when the
field is SQL NULL. In the built-in InnoDB, initialize
ufield->extern_storage = FALSE (an initialization bug that had gone
unnoticed thus far). The InnoDB Plugin and later have this flag in
dfield_t and have always initialized it properly.
row_ins_cascade_calc_update_vec(): Reduce the scope of some
pointers. Initialize orig_len. (This caused the bug in InnoDB Plugin
and later.)
row_ins_foreign_check_on_constraint(): Simplify a condition. Declare
the update vector uninitialized.
rb:771 approved by Jimmy Yang
PARENT FOR OTHER ONE
Do not try to look up the key_nr'th key in 'table', because there may not
be such a key there. key_nr is the number of the key in the _child_ table,
not in the parent table.
Instead just print the fields of the record that are covered by the first key
defined on the parent table.
This bug gets a better fix in MySQL 5.6, which is too risky for 5.1 and 5.5.
Approved by: Jon Olav Hauglid (via IM)
HA_INNOBASE::UPDATE_ROW, TEMPORARY TABLE, TABLE LOCK".
Attempt to update an InnoDB temporary table under LOCK TABLES
led to an assertion failure in both debug and production builds
if this temporary table was explicitly locked for READ. The
same scenario works fine for MyISAM temporary tables.
The assertion failure was caused by a discrepancy between the lock
that was requested on the rows of the temporary table at LOCK TABLES
time and by the update operation. Since the SQL layer requested a
read lock at LOCK TABLES time, the InnoDB engine assumed that upcoming
statements which are going to be executed under LOCK TABLES will
only read the table and therefore should acquire only an S-lock.
An update operation broke this assumption by requesting an X-lock.
Possible approaches to fixing this problem are:
1) Skip locking of temporary tables as locking doesn't make any
sense for connection-local objects.
2) Prohibit changing of a temporary table locked by LOCK TABLES ...
READ.
Unfortunately both of these approaches have drawbacks that make
them unviable for stable versions of the server.
So this patch takes another approach and changes the code in such a way
that LOCK TABLES for a temporary table will always request a write
lock. In the 5.1 version of this patch, the switch from read lock to
write lock is done inside of InnoDB's handler methods, as doing it on
the SQL layer causes compatibility troubles with FLUSH TABLES WITH
READ LOCK.
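A C++ sketch of the 5.1-style approach inside the handler, using
hypothetical simplified types (the real change is in InnoDB's handler
methods, e.g. around ha_innobase::store_lock(); the helper name here
is made up for illustration):

enum thr_lock_type_sketch { TL_READ, TL_WRITE };
enum sql_command_sketch { SQLCOM_LOCK_TABLES, SQLCOM_OTHER };

/* Hypothetical helper mirroring the idea of the fix: when LOCK TABLES
is executed on an InnoDB temporary table, upgrade the requested read
lock to a write lock, so that a later UPDATE under LOCK TABLES does not
violate the S-lock assumption InnoDB made at LOCK TABLES time. */
static thr_lock_type_sketch
adjust_lock_for_tmp_table(thr_lock_type_sketch requested,
                          sql_command_sketch sql_command,
                          bool is_temporary_table)
{
    if (is_temporary_table
        && sql_command == SQLCOM_LOCK_TABLES
        && requested == TL_READ) {
        return(TL_WRITE);   /* always request a write lock */
    }
    return(requested);
}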
causes future shutdown hang
InnoDB would hang on shutdown if any XA transactions exist in the
system in the PREPARED state. This has been masked by the fact that
MySQL would roll back any PREPARED transaction on shutdown, in the
spirit of Bug #12161 Xa recovery and client disconnection.
[mysql-test-run] do_shutdown_server: Interpret --shutdown_server 0 as
a request to kill the server immediately without initiating a
shutdown procedure.
xid_cache_insert(): Initialize XID_STATE::rm_error in order to avoid a
bogus error message on XA ROLLBACK of a recovered PREPARED transaction.
innobase_commit_by_xid(), innobase_rollback_by_xid(): Free the InnoDB
transaction object after rolling back a PREPARED transaction.
trx_get_trx_by_xid(): Only consider transactions whose
trx->is_prepared flag is set. The MySQL layer seems to prevent
attempts to roll back connected transactions that are in the PREPARED
state from another connection, but it is better to play it safe. The
is_prepared flag was introduced in the InnoDB Plugin.
trx_n_prepared: A new counter, counting the number of InnoDB
transactions in the PREPARED state.
logs_empty_and_mark_files_at_shutdown(): On shutdown, allow
trx_n_prepared transactions to exist in the system.
trx_undo_free_prepared(), trx_free_prepared(): New functions, to free
the memory objects of PREPARED transactions on shutdown. This is not
needed in the built-in InnoDB, because it would collect all allocated
memory on shutdown. The InnoDB Plugin needs this because of
innodb_use_sys_malloc.
trx_sys_close(): Invoke trx_free_prepared() on all remaining
transactions.
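A C++ sketch of the shutdown-side idea, with hypothetical simplified
types: shutdown proceeds once only PREPARED transactions remain, and
their memory objects are then freed without rolling them back (so they
survive into the next startup's XA RECOVER):

#include <list>

enum trx_state_sketch { TRX_ACTIVE, TRX_PREPARED };

struct trx_sketch { trx_state_sketch state; };

static std::list<trx_sketch*> trx_list;   /* stand-in: trx_sys->trx_list */
static unsigned long trx_n_prepared = 0;  /* the new counter */

/* Stand-in for the check in logs_empty_and_mark_files_at_shutdown():
shutdown may proceed when everything still in the system is PREPARED. */
static bool only_prepared_trx_remain()
{
    unsigned long n_other = 0;
    for (trx_sketch* trx : trx_list) {
        if (trx->state != TRX_PREPARED) {
            n_other++;
        }
    }
    return(n_other == 0);
}

/* Stand-in for trx_sys_close() invoking trx_free_prepared(): free the
memory objects of the remaining PREPARED transactions. */
static void trx_sys_close_sketch()
{
    for (trx_sketch* trx : trx_list) {
        /* only PREPARED transactions remain at this point */
        trx_n_prepared--;
        delete trx;
    }
    trx_list.clear();
}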
Bug#59410 read uncommitted: unlock row could not find a 3 mode lock
on the record
This bug is present only in 5.6, but I am adding the test case to earlier
versions to ensure it never appears in earlier versions either.
primary_key_no == 0".
Attempt to create an InnoDB table with a non-nullable column of
geometry type, having a unique key of length 12 on it and
some other candidate key, led to a server crash due to an
assertion failure in both non-debug and debug builds.
The problem was that such a non-candidate key could have
been sorted as the first key in table/.FRM, before any legit
candidate keys. This resulted in an assertion failure in the InnoDB
engine, which assumes that the primary key should either be the
first key in table/.FRM or should not exist at all.
The reason behind such incorrect sorting was a wrong
value of the Create_field::key_length member for geometry fields
(which was set to its pack_length == 12), which confused the code
in mysql_prepare_create_table() so that it would skip marking
such a key as a key with partial segments.
This patch fixes the problem by ensuring that this member
gets the same value of Create_field::key_length as
for other blob fields (from which the geometry field class is
inherited), and as a result unique keys on geometry fields
are correctly marked as having partial segments.
"rows examined" estimates". This change implements "innodb_stats_method"
with options of "nulls_equal", "nulls_unequal" and "null_ignored".
rb://553 approved by Marko
--Bug#52157 various crashes and assertions with multi-table update, stored function
--Bug#54475 improper error handling causes cascading crashing failures in innodb/ndb
--Bug#57703 create view cause Assertion failed: 0, file .\item_subselect.cc, line 846
--Bug#57352 valgrind warnings when creating view
--Recently discovered problem when a nested materialized derived table
is used before being populated, leading to an incorrect result
We have several modes in which we should disable subquery evaluation.
The reasons for disabling are different. It could be
the uselessness of the evaluation, as in the case of 'CREATE VIEW'
or 'PREPARE stmt'; or we should disable subquery evaluation
if the tables are not locked yet, as happens in Bug#54475; or
too-early evaluation of subqueries can lead to a wrong result,
as happened in Bug#19077.
The main problem is that if subquery items are treated as const,
they are evaluated in ::fix_fields() and ::fix_length_and_dec()
of the parental items, as a lot of these methods have
Item::val_...() calls inside.
We have to make subqueries non-const to prevent unnecessary
subquery evaluation. At the moment we have different methods
for this. Here is a list of these modes:
1. PREPARE stmt;
We use the UNCACHEABLE_PREPARE flag.
It is set during parsing in sql_parse.cc, mysql_new_select(), for
each SELECT_LEX object and cleared at the end of PREPARE in
sql_prepare.cc, init_stmt_after_parse(). If this flag is set, the
subquery becomes non-const and evaluation does not happen.
2. CREATE|ALTER VIEW, SHOW CREATE VIEW, I_S tables which
process FRM files
We use the LEX::view_prepare_mode field. We set it before
view preparation and check this flag in
::fix_fields() and ::fix_length_and_dec().
Some bugs are fixed using this approach,
some are not (Bug#57352, Bug#57703). The problem here is
that we have a lot of ::fix_fields() and ::fix_length_and_dec()
methods where we use Item::val_...() calls for const items.
3. Derived tables with subquery = wrong result (Bug#19077)
The reason for this bug is too-early subquery evaluation.
It was fixed by adding the Item::with_subselect field.
The check of this field in appropriate places prevents
const item evaluation if the item has a subquery.
The fix for Bug#19077 fixes only the problem with the
convert_constant_item() function and does not cover
other places (::fix_fields(), ::fix_length_and_dec() again)
where subqueries could be evaluated.
Example:
CREATE TABLE t1 (i INT, j BIGINT);
INSERT INTO t1 VALUES (1, 2), (2, 2), (3, 2);
SELECT * FROM (SELECT MIN(i) FROM t1
WHERE j = SUBSTRING('12', (SELECT * FROM (SELECT MIN(j) FROM t1) t2))) t3;
DROP TABLE t1;
4. Derived tables with subquery where the subquery
is evaluated before table locking (Bug#54475, Bug#52157)
The suggested solution is the following:
-Introduce a new field LEX::context_analysis_only with the following
possible flags:
#define CONTEXT_ANALYSIS_ONLY_PREPARE 1
#define CONTEXT_ANALYSIS_ONLY_VIEW 2
#define CONTEXT_ANALYSIS_ONLY_DERIVED 4
-Set/clear these flags when we perform
a context analysis operation
-Item_subselect::const_item() returns a
result depending on LEX::context_analysis_only.
If context_analysis_only is set, then we return
FALSE, which means that the subquery is non-const.
As all subquery types are wrapped by Item_subselect,
this allows us to make a subquery non-const whenever
necessary. A sketch of the idea follows.
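A minimal C++ sketch with simplified stand-ins (the real classes live in
item_subselect.h and sql_lex.h; the member names here are simplified):

#define CONTEXT_ANALYSIS_ONLY_PREPARE 1
#define CONTEXT_ANALYSIS_ONLY_VIEW 2
#define CONTEXT_ANALYSIS_ONLY_DERIVED 4

struct LEX {
    unsigned context_analysis_only; /* 0 when evaluation is allowed */
};

struct THD {
    LEX* lex;
};

class Item_subselect /* : public Item */ {
public:
    explicit Item_subselect(THD* thd) : thd_(thd), const_item_cache_(true) {}

    /* During PREPARE, view handling or derived-table context analysis
    the subquery must not be treated as const, or it would be evaluated
    too early (before tables are locked or populated). */
    bool const_item() const
    {
        return(thd_->lex->context_analysis_only == 0
               && const_item_cache_);
    }

private:
    THD* thd_;
    bool const_item_cache_;
};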
The auto-increment value wraps when performing a bulk insert with
auto_increment_increment and auto_increment_offset greater than
one.
The fix:
If overflow happens, return the MAX_ULONGLONG value as an
indication of overflow, and check for this before storing the
value into the field in update_auto_increment().
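A minimal C++ sketch of the overflow check, with a hypothetical
simplified signature (the real computation in InnoDB also accounts for
the offset grid and the column's maximum value):

#include <climits>
#include <cstdio>

typedef unsigned long long ulonglong;

/* Sketch: compute the next auto-increment value, returning the maximum
ulonglong value as an overflow marker instead of silently wrapping. */
static ulonglong next_autoinc_sketch(ulonglong current, ulonglong increment)
{
    if (current > ULLONG_MAX - increment) {
        return(ULLONG_MAX); /* overflow indication */
    }
    return(current + increment);
}

int main()
{
    ulonglong next = next_autoinc_sketch(ULLONG_MAX - 2, 5);

    /* The caller (update_auto_increment() in the real code) must
    check for the marker before storing the value into the field. */
    if (next == ULLONG_MAX) {
        printf("auto-increment overflow detected\n");
    }
    return(0);
}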
In the case of a low-memory sort buffer, QUICK_INDEX_MERGE_SELECT creates
a temporary file where it stores row ids which meet the QUICK_SELECT
ranges, except for the clustered PK range; the clustered range is
processed separately.
In init_read_record we check whether the temporary file is used and choose
the appropriate record access method. This does not take into account that
the temporary file contains only a partial result in the case of
QUICK_INDEX_MERGE_SELECT with a clustered PK range.
The fix is to always use rr_quick if QUICK_INDEX_MERGE_SELECT
with a clustered PK range is used.
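A C++ sketch of the access-method decision, with hypothetical simplified
names (the real logic lives in init_read_record in the SQL layer):

enum read_record_func_sketch { RR_QUICK, RR_FROM_TEMPFILE };

/* Sketch: the temp file only holds a partial result when the index
merge also covers a clustered PK range, so in that case rows must still
be fetched through rr_quick. */
static read_record_func_sketch
choose_read_record(bool uses_tempfile, bool merge_has_clust_pk_range)
{
    if (uses_tempfile && !merge_has_clust_pk_range) {
        return(RR_FROM_TEMPFILE);
    }
    return(RR_QUICK);
}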
ALTER TABLE RENAME, DISABLE KEYS.
The code of ALTER TABLE RENAME, DISABLE KEYS could
issue a commit while holding the LOCK_open mutex.
This is a regression introduced by the fix for
Bug 54453.
This failed an assert guarding us against a potential
deadlock with connections trying to execute
FLUSH TABLES WITH READ LOCK.
The fix is to move the acquisition of LOCK_open outside
the section that issues ha_autocommit_or_rollback().
LOCK_open is taken to protect against concurrent
operations with .frms and the table definition
cache, and doesn't need to cover the call to commit.
A test case is added to innodb_mysql.test.
The patch is to be null-merged to 5.5, which
already has 54453 null-merged to it.
row_search_for_mysql(): When a secondary index record might not be
visible in the current transaction's read view and we consult the
clustered index and optionally some undo log records, return the
relevant columns of the clustered index record to MySQL instead of the
secondary index record.
REC_INFO_DELETED_FLAG: Move the definition from rem0rec.ic to rem0rec.h.
ibuf_insert_to_index_page_low(): New function, refactored from
ibuf_insert_to_index_page().
ibuf_insert_to_index_page(): When we are inserting a record in place
of a delete-marked record and some fields of the record differ, update
that record just like row_ins_sec_index_entry_by_modify() would do.
mysql_row_templ_t: Add clust_rec_field_no.
row_sel_store_mysql_rec(), row_sel_push_cache_row_for_mysql(): Add the
flag rec_clust, for returning data at clust_rec_field_no instead of
rec_field_no. Resurrect the debug assertion that the record not be
marked for deletion. (Bug #55626)
buf_LRU_free_block(): Refactored from
buf_LRU_search_and_free_block(). This is needed for the
innodb_change_buffering_debug diagnostics.
[UNIV_DEBUG || UNIV_IBUF_DEBUG] ibuf_debug, buf_page_get_gen(),
buf_flush_page_try():
Implement innodb_change_buffering_debug=1 for evicting pages from the
buffer pool, so that change buffering will be attempted more
frequently.
In order to fix this bug we need to distinguish whether ha_innobase::info()
has been called from ::analyze() or not. Rename ::info() to ::info_low()
and add a boolean parameter that tells whether the call is from ::analyze()
or not. Create a new simple ::info() that just calls
::info_low(false) (not called from analyze). From ::analyze(), instead of
::info(), call ::info_low(true) (called from analyze).
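A minimal C++ sketch of the refactoring, with simplified signatures (the
real methods take a uint flag argument and live in ha_innodb.cc; the
class name here is a stand-in):

class ha_innobase_sketch {
public:
    int info(unsigned flag)
    {
        /* plain ::info() is never "called from analyze" */
        return(info_low(flag, false));
    }

    int analyze()
    {
        /* ::analyze() announces itself to ::info_low() */
        return(info_low(0 /* HA_STATUS_... flags */, true));
    }

private:
    int info_low(unsigned flag, bool called_from_analyze)
    {
        if (called_from_analyze) {
            /* e.g. force a fresh statistics calculation */
        }
        /* ... the common implementation formerly in ::info() ... */
        (void) flag;
        return(0);
    }
};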
Approved by: Jimmy (rb://487)
Just remove the check whether the file is "too big".
Similar code exists in ha_innobase::update_table_comment(), but that
method does not seem to be used.