AFTER A ROW IS READ
Approved by: Sunny Bains rb://2425
Don't release concurrency tickets when asked to release
btr_search_latch. This is a 5.5 only bug. It is already
fixed in 5.6 upwards.
== Analysis ==
Both change buffer pages and on-disk indexes pages are marked as
FIL_PAGE_INDEX. So all ibuf index pages will classify as INDEX with NULL
table_name and index_name.
== Solution ==
A new page type for ibuf data pages named I_S_PAGE_TYPE_IBUF is defined. All
these pages whose index_id equal (DICT_IBUF_ID_MIN + IBUF_SPACE_ID) will
classify as IBUF_DATA instead of INDEX in INNODB_BUFFER_PAGE
and INNODB_BUFFER_PAGE_LRU.
This fix is only for IS reporting, both on-disk and buffer pool structures
keep unchanged.
Approved by both Marko and Jimmy. rb#2334
TRANSACTION ROLLBACK
Problem:
=======
"prepare_commit_mutex" is acquired during "innobase_xa_prepare"
and it is freed only in "innobase_commit". After prepare,
if the commit operation fails the transaction is rolled back
but the mutex is not released.
Analysis:
========
During transaction commit process transaction is prepared and
the "prepare_commit_mutex" is acquired to preserve the order
of commit. After prepare write to binlog is initiated.
File: sql/handler.cc
if (error || (is_real_trans && xid &&
-----> (error= !(cookie= tc_log->log_xid(thd, xid)))))
{
ha_rollback_trans(thd, all);
In the above code "tc_log->log_xid" operation fails.
When the write to binlog fails the transaction is rolled back
with out freeing the mutex. A subsequent "INSERT" operation
tries to acquire the same mutex during its commit process
and the server aborts.
Fix:
===
"prepare_commit_mutex" is freed during "innobase_rollback".
storage/innobase/handler/ha_innodb.cc:
Added code to free "prepare_commit_mutex"
Bug #16754901 PARS_INFO_FREE NOT CALLED IN DICT_CREATE_ADD_FOREIGN_TO_DICTIONARY
Problem:
There are two situations here. The constraint name is explicitly
given by the user and the constraint name is automatically generated
by InnoDB. In the case of generated constraint name, it is formed by
adding table name as prefix. The table names are stored internally in
my_charset_filename. In the case of constraint name explicitly given
by the user, it is stored in UTF8 format itself. So, in some
situations the constraint name is in utf8 and in some situations it is
in my_charset_filename format. Hence this problem.
Solution:
Always store the foreign key constraint name in UTF-8 even when
automatically generated.
Bug #16754901 PARS_INFO_FREE NOT CALLED IN DICT_CREATE_ADD_FOREIGN_TO_DICTIONARY
Problem:
There was a memory leak in the function dict_create_add_foreign_to_dictionary().
The allocated pars_info_t object is not freed in the error code path.
Solution:
Allocate the pars_info_t object after the error checking.
rb#2368 in review
eliminate a race condition over recv_sys->n_addrs which might result in a database corruption
in recovery, without reporting a recovery error.
recv_recover_page_func(): move the code segment that decrements recv_sys->n_addrs
to the end of the function, after the call to mtr_commit()
rb://2282 approved by Inaam
After a clean shutdown, InnoDB will not check the *.ibd file headers,
for maximum performance. This is unchanged before and after this
patch.
What this fix addresses is the case when crash recovery is
needed. Previously, InnoDB could load a corrupted tablespace file.
buf_page_is_corrupted(): Add the parameter check_lsn.
fil_check_first_page(): New function, to perform a consistency check
on the first page of a file. This can be overridden by setting
innodb_force_recovery.
fil_read_first_page(), fil_open_single_table_tablespace(),
fil_load_single_table_tablespace(): Invoke fil_check_first_page().
open_or_create_data_files(): Check the status of
fil_open_single_table_tablespace().
rb#2352 approved by Jimmy Yang
OPENING MISSING PARTITION
In the ha_innobase::open() call, for normal tables, there is no retry logic.
But for partitioned tables, there is a retry logic introduced as fix for:
http://bugs.mysql.com/bug.php?id=33349https://support.mysql.com/view.php?id=21080
The Bug#33349, does not provide sufficient information to analyze the original
problem. The original problem reported by bug#33349 is also minor (just an
annoyance and no loss of functionality). Most importantly, the retry logic
has been introduced without any associated test case.
So we are removing the retry logic for partitioned tables. When the original
problem occurs, a different solution will be explored.
TABLE/KEY RELATIONS
The DICT_FK_MAX_RECURSIVE_LOAD was reduced from 250 to 33 in rb#2058.
But in optimized build, this recursive depth is still too deep and
resulted in stack overflow. So reducing this depth to 20 now.
TABLE/KEY RELATIONS
Problem:
When there are many tables, linked together through the foreign key
constraints, then loading one table will recursively open other tables. This
can sometimes lead to thread stack overflow. In such situations the server
will exit.
I see the stack overflow problem when the thread_stack is 196608 (the default
value for 32-bit systems). I don't see the problem when the thread_stack is
set to 262144 (the default value for 64-bit systems).
Solution:
Currently, in InnoDB, there is a macro DICT_FK_MAX_RECURSIVE_LOAD which defines
the maximum number of tables that will be loaded recursively because of foreign
key relations. This is currently set to 250. We can reduce this number to 33
(anything more than 33 does not solve the problem for the default value). We
can keep it small enough so that thread stack overflow does not happen for the
default values. Reducing the DICT_FK_MAX_RECURSIVE_LOAD will not affect the
functionality of InnoDB. The tables will eventually be loaded.
rb#2058 approved by Marko
TABLE/KEY RELATIONS
Problem:
When there are many tables, linked together through the foreign key
constraints, then loading one table will recursively open other tables. This
can sometimes lead to thread stack overflow. In such situations the server
will exit.
I see the stack overflow problem when the thread_stack is 196608 (the default
value for 32-bit systems). I don't see the problem when the thread_stack is
set to 262144 (the default value for 64-bit systems).
Solution:
Currently, in InnoDB, there is a macro DICT_FK_MAX_RECURSIVE_LOAD which defines
the maximum number of tables that will be loaded recursively because of foreign
key relations. This is currently set to 250. We can reduce this number to 33
(anything more than 33 does not solve the problem for the default value). We
can keep it small enough so that thread stack overflow does not happen for the
default values. Reducing the DICT_FK_MAX_RECURSIVE_LOAD will not affect the
functionality of InnoDB. The tables will eventually be loaded.
rb#2058 approved by Marko
UPDATES
After checking that the table has changed too much in
row_update_statistics_if_needed() and calling dict_update_statistics(),
also check if the same condition holds after acquiring the table stats
latch. This is to avoid multiple threads concurrently entering and
executing the stats update code.
Approved by: Marko (rb:2186)
FREED LOCK
ANALYIS
-------
In 5.5 code the lock_rec_block_validate() is called after releasing
the kernel mutex. There is a chance that the lock might be invalid so,
we are getting the valgrind error on invalid read on lock->index.
FIX
---
Fix would be to copy the lock->index when we are holding the kernel mutex
and then pass it to the lock_rec_block_validate(). This implementation
is present in 5.1 code.
[ Approved by sunny rb.no.oracle.com/rb/r/2152/ ]
IBUF, FREE SPACE MANAGEMENT
ibuf_merge_or_delete_for_page(): Declare the user index page latched
for UNIV_SYNC_DEBUG after opening the change buffer cursor. This
should avoid the bogus latching order violation.
ibuf_delete_rec(): Add assertions to the callers, checking that the
mini-transaction was committed when the function returned TRUE. This
is a non-functional change, just clarifying the code.
rb#2136 approved by Kevin Lewis
MEM_HEAP_CREATE_BLOCK()
PROBLEM
-------
If we give start mysqld with the option --innodb_log_buffer_size=50GB
,then mem_area_alloc() function fails to allocate memory and returns
NULL.In debug version we assert at this point,but there is no check in
release version and we get a segmentation fault.
FIX
---
Added a log message saying that we are unable to allocate memory.
After this message we assert.
[Approved by Kevin http://rb.no.oracle.com/rb/r/2065 ]
INSERT WITH SAME VALUES
Problem:
When a transaction is in READ COMMITTED isolation level, gap locks are still
taken in the secondary index, when row is inserted. This happens when the
secondary index is scanned for duplicate.
The function row_ins_scan_sec_index_for_duplicate() always calls the
function row_ins_set_shared_rec_lock() with LOCK_ORDINARY irrespective of
the transaction isolation level.
Solution:
The function row_ins_scan_sec_index_for_duplicate() calls the
function row_ins_set_shared_rec_lock() with LOCK_ORDINARY or
LOCK_REC_NOT_GAP based on the transaction isolation level.
rb://2035 approved by Krunal and Marko
This is a deadlock that will also be fixed in the server by
Bug #11844915 - HANG IN THDVAR MUTEX ACQUISITION.
So this is a simple alternate method of fixing the same problem,
but from within InnoDB.
The simple change is to make rename table start a transaction
before locking dict_sys->mutex since thd_supports_xa() can call
THDVAR which can lock a mutex, LOCK_global_system_variables, that
is used in the server by many other activities. At least one of
those, sys_var::update(), can call back into InnoDB and try to
lock dict_sys->mutex while holding LOCK_global_system_variables.
The other bug fix for 11844915 eliminates the use of
LOCK_global_system_variables for calls to THDVAR.
Approved by marko in http://rb.no.oracle.com/rb/r/2000/
FROM SHOW CREATE
Problem: The length of the internally generated foreign key name
is not checked.
Solution: The length of the internally generated foreign key name is
checked. If it is greater than the allowed limit, an error message
is reported. Also, the constraint name is printed in the same manner
as the table name, using the system charset information.
rb://1969 approved by Marko.
DEREFERENCING UT_DBG_NULL_PTR
The abort() call is standard C but InnoDB only uses it in GCC
environments. UT_DBG_USE_ABORT is not defined the code crashed
by dereferencing a null pointer instead of calling abort().
Other code throughout MySQL including ndb, sql, mysys and other
places call abort() directly.
This bug also affects innodb.innodb_bug14147491.test which fails
randomly on windows because of this issue.
Approved by marko in http://rb.no.oracle.com/rb/r/1936/
I/O IS ASYNC
rb://1934
approved by: Mikael Ronstrom (over email)
When submitting AIO read request don't signal that the thread is
about to wait on DISKIO
WITH --SKIP-INNODB
Description
-----------
If the server is started with skip-innodb or InnoDB otherwise fails to
start, any one of these queries will crash the server:
For (5.5)
SELECT * FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE;
SELECT * FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE_LRU;
SELECT * FROM INFORMATION_SCHEMA.INNODB_BUFFER_POOL_STATS;
In (5.6+) ,following queries will also crash the server.
SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES;
SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES;
SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_COLUMNS;
SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_FIELDS;
SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_FOREIGN;
SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_FOREIGN_COLS;
SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS;
SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_DATAFILES;
SELECT * FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES;
FIX
----
When Innodb is not active we must prevent it from processing
these tables,so we return a warning saying that innodb is not
active.
Approved by marko (http://rb.no.oracle.com/rb/r/1891)
With innodb_change_buffering enabled, Innodb buffers
all modifications to secondary index leaf pages when
the leaf pages are not in buffer pool.
Crash InnoDB while an IBUF_OP_DELETE is being applied.
Restart and note that the same record can be applied
again which may lead to crash.
Mark the change buffer record processed, so that it will
not be merged again in case the server crashes between
the following mtr_commit() and the subsequent mtr_commit()
of deleting the change buffer record.
Testcase: No testcase because it is difficult to get the
timing right with the two asyncronous task purge and change
buffering
Approved by Marko. rb#1893
Get rid of O(n^2) scan in dyn array (mtr->memo) operations, accessing
the dyn array blocks directly.
dyn_array_get_last_block(), dyn_array_get_next_block(),
dyn_array_get_prev_block(): Define as a constness-preserving macro.
Add const qualifiers to many dyn_array functions.
mtr_memo_slot_release_func(): Renamed from mtr_memo_slot_release():
Make mtr_t* a debug-only parameter. Assume that slot->object != NULL.
mtr_memo_pop_all(): Access the dyn_array blocks directly, replacing
O(n^2) operation with O(n).
mtr_memo_release(): Access the dyn_array blocks directly, replacing
O(n^2) operation with O(n). This caused the performance problem.
rb#1540 approved by Jimmy Yang
WITH AN ASSERTION
Recently we added check to handle kill query signal for long operating
queries.
While the query interruption is reported it must to ensure cursor is restore
to proper state for HANDLER interface to work correctly.
Normal select query will not face this problem, as on recieving interrupt,
select query is aborted and new select query result in re-initialization
(including cursor).
rb://1836. Approved by Marko.
Problem:
During the index intersect access method, the SQL layer will access one row,
that satisfies a set of conditions, using an index i1. And then it will try to
access the same row, with other set of conditions using the next index i2. If
the fetch from i2 fails (we are talking about an error situation here and not
simply an unmatched row situation), then it will unlock the row accessed via
i1. This will work in all situations except deadlock error.
When a deadlock happens, InnoDB will rollback the transaction. InnoDB intimates
the SQL layer about this through the THD::transaction_rollback_request member.
But this is not currently used by the SQL layer.
Solution:
When an error happens, the SQL layer must check the
THD::transaction_rollback_request member, before calling handler::unlock_row().
We have also added a debug assert in ha_innobase::unlock_row() checking that
it must be called only when the transaction is in active state.
rb#1773 approved by Marko and Sunny.
HANG
Problem Statement:
When the operation RENAME TABLE is about rename the tablespace of the
table, it will stop all i/o operations on the tablespace temporarily.
For this the fil_space_t::stop_ios member is used.
Once the fil_space_t::stop_ios member is set to TRUE in the RENAME
TABLE operation, it is expected that no new i/o operation will be done
on the tablespace and all pending i/o operation can be completed on
the tablespace.
If the pending i/o operations initiate any new i/o operations then
there will be deadlock. The RENAME TABLE operation will be waiting
for pending i/o on the tablespace to be completed, and the pending i/o
operations will be waiting on the RENAME TABLE operation to set the
file_space_t::stop_ios flag to be set to FALSE.
But in the given scenario the pending i/o operations did not initiate
new i/o. But they where still unnecessarily checking the
fil_space_t::stop_ios flag. This resulted in deadlock.
Solution:
I noticed that this deadlock happens in fil_space_get_size() and
fil_space_get_zip_size() in the i/o threads. These functions check
the stop_ios flag even when no i/o will be initiated. I modified
these functions to ensure that they check the stop_ios flag only when
they will be initiating an i/o operation. This solves the problem.
rb://1635 (mysql-5.5)
rb://1660 (mysql-trunk) approved by Inaam, Jimmy, and ima.