Merge Facebook commit 25295d003cb0c17aa8fb756523923c77250b3294
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6
This adds a pointer to the trx to each mtr.
This allows the trx to be accessed in parts of the code
where it was otherwise not available. This is needed later.
Merged Facebooks commit 6e06bbfa315ffb97d713dd6e672d6054036ddc21
authored by Inaam Rana from https://github.com/facebook/mysql-5.6.
Fixes MySQL bug http://bugs.mysql.com/bug.php?id=72123
lock_timeout thread works in a tight loop waking up every second
and checking for lock_wait_timeout. In addition, when a mysql
thread is forced to wait on a lock, it signals the lock_timeout thread
as well. This call is not required. In a heavily contended workload
each thread going to wait will signal the lock_timeout thread making
it work all the time. As lock_timeout thread scans the array of
waiting threads under lock_sys::wait_mutex which is already very
hot in contneded loads, these extra scans can cause significanct
performance regression.
Also, in various codepaths lock_timeout thread is signalled where
actual intention was to signal the innodb monitor thread.
THE PERFORMANCE UNDER HEAVY INSERT
Problem:
There are three memset call to allocate memory for system fields
in each insert.
Solution:
Instead of calling it in 3 times, we can combine it into
one memset call. It will reduce the CPU usage under heavy insert.
Approved by Marko rb-4916
Problem:
In the clustered index, when an update operation is done the overall
scenario (after rb#4479) is as follows:
1. Delete mark the old record that is to be updated.
2. The old record disowns the blobs.
3. Insert the new record into clustered index.
4. For non-updated blobs, new record must own it. Verified by assert.
5. For non-updated blobs, in new record marked as inherited.
Scenario involving DB_LOCK_WAIT:
If step 3 times out, then we will skip 1 and 2 and will continue from
step 3. This skipping is achieved by the UPD_NODE_INSERT_BLOB state.
In this case, step 4 is not correct. Because of step 1, the new
record need not own the blobs. Hence the assert failure.
Solution:
The assert in step 4 is removed. Instead code is added to ensure that
the record owns the blob.
Note:
This is a regression caused by rb#4479.
rb#4571 approved by Marko
Syntax. Server support. Test cases.
InnoDB bugfixes:
* don't mess around with system sprintf's, always use my_error() for errors.
* don't use InnoDB internal error codes where OS error codes are expected.
* don't say "file not found", when it was.
Update InnoDB to 5.6.14
Apply MySQL-5.6 hack for MySQL Bug#16434374
Move Aria-only HA_RTREE_INDEX from my_base.h to maria_def.h (breaks an assert in InnoDB)
Fix InnoDB memory leak
Problem:
The function row_upd_changes_ord_field_binary() is used to decide whether to
use row_upd_clust_rec_by_insert() or row_upd_clust_rec(). The function
row_upd_changes_ord_field_binary() does not make use of charset information.
Based on binary comparison it decides that r1 and r2 differ in their ordering
fields.
In the function row_upd_clust_rec_by_insert(), an update is done by delete +
insert. These operations internally make use of cmp_dtuple_rec_with_match()
to compare records r1 and r2. This comparison takes place with the use of
charset information.
This means that it is possible for the deleted record to be reused in the
subsequent insert. In the given scenario, the characters 'a' and 'A' are
considered equal in the my_charset_latin1. When this happens, the ownership
information of externally stored blobs are not correctly handled.
Solution:
When an update is done by delete followed by insert, disown the relevant
externally stored fields during the delete marking itself (within the same
mtr). If the insert succeeds, then nothing with respect to blob ownership
needs to be done. If the insert fails, then the disown done earlier will be
removed when the operation is rolled back.
rb#4479 approved by Marko.
Also known as MySQL#70047 and BUG#17316314 (srv_buf_size not declared).
The workaround is taken from MySQL 5.6 tree:
BUG#17316314 - SRV_BUF_SIZE NOT DECLARED
Temporary fix. Disabling FALLOC_FL_PUNCH_HOLE for now
Regression from bug#14621190 due to disabled optimistic restoration
of cursor, which required full key lookup instead of verifying
if previously positioned btree cursor could be reused.
Fixed by enable optimistic restore and adjust cursor afterward.
rb#3324 approved by Marko.
Analysis: After ALTER TABLE the table statistics needs to be rebuilt and therefore stat_initialized is set false. It will be rebuilt when the table is loaded again and table is closed when alter table is completed. However, during alter table table could be used by concurrent SELECT from I_S. Therefore, we need to rebuild transient table statistics meanwhile until table can be reloaded.
INDEX_READ_MAP HAD NO MATCH
If index_read_map is called for exact search and no matching records
exists it will position the cursor on the next record, but still having the
relative position to BTR_PCUR_ON.
This will make a call for index_next to read yet another next record,
instead of returning the record the cursor points to.
Fixed by setting pcur->rel_pos = BTR_PCUR_BEFORE if an exact
[prefix] search is done, but failed.
Also avoids optimistic restoration if rel_pos != BTR_PCUR_ON,
since btr_cur may be different than old_rec.
rb#3324, approved by Marko and Jimmy
DICT_TABLE_GET_FORMAT(CLUST_INDEX->TABLE) >= 1
The function row_sel_sec_rec_is_for_clust_rec() was incorrectly
preparing to compare a NULL column prefix in a secondary index with a
non-NULL column in a clustered index.
This can trigger an assertion failure in 5.1 plugin and later. In the
built-in InnoDB of MySQL 5.1 and earlier, we would apparently only do
some extra work, by trimming the clustered index field for the
comparison.
The code might actually have worked properly apart from this debug
assertion failure. It is merely doing some extra work in fetching a
BLOB column, and then comparing it to NULL (which would return the
same result, no matter what the BLOB contents is).
While the test case involves CHECK TABLE, this could theoretically
occur during any read that uses a secondary index on a column prefix
of a column that can be NULL.
rb#3101 approved by Mattias Jonsson
There was a race condition in the rollback of TRX_UNDO_UPD_DEL_REC.
Once row_undo_mod_clust() has rolled back the changes by the rolling-back
transaction, it attempts to purge the delete-marked record, if possible, in a
separate mini-transaction.
However, row_undo_mod_remove_clust_low() fails to check if the DB_TRX_ID of
the record that it found after repositioning the cursor, is still the same.
If it is not, it means that the record was purged and another record was
inserted in its place.
So, the rollback would have performed an incorrect purge, breaking the
locking rules and causing corruption.
The problem was found by creating a table that contains a unique
secondary index and a primary key, and two threads running REPLACE
with only one value for the unique column, so that the uniqueness
constraint would be violated all the time, leading to statement
rollback.
This bug exists in all InnoDB versions (I checked MySQL 3.23.53).
It has become easier to repeat in 5.5 and 5.6 thanks to scalability
improvements and a dedicated purge thread.
rb#3085 approved by Jimmy Yang
Problem:
When the user specified foreign key name contains "_ibfk_", InnoDB wrongly
tries to rename it.
Solution:
When a table is renamed, all its associated foreign keys will also be renamed,
only if the foreign key names are automatically generated. If the foreign key
names are given by the user, even if it has _ibfk_ in it, it must not be
renamed.
rb#2935 approved by Jimmy, Krunal and Satya
includes:
* remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING"
* introduce LOCK_share, now LOCK_ha_data is strictly for engines
* rea_create_table() always creates .par file (even in "frm-only" mode)
* fix a 5.6 bug, temp file leak on dummy ALTER TABLE
Bug #16754901 PARS_INFO_FREE NOT CALLED IN DICT_CREATE_ADD_FOREIGN_TO_DICTIONARY
Problem:
There are two situations here. The constraint name is explicitly
given by the user and the constraint name is automatically generated
by InnoDB. In the case of generated constraint name, it is formed by
adding table name as prefix. The table names are stored internally in
my_charset_filename. In the case of constraint name explicitly given
by the user, it is stored in UTF8 format itself. So, in some
situations the constraint name is in utf8 and in some situations it is
in my_charset_filename format. Hence this problem.
Solution:
Always store the foreign key constraint name in UTF-8 even when
automatically generated.
Bug #16754901 PARS_INFO_FREE NOT CALLED IN DICT_CREATE_ADD_FOREIGN_TO_DICTIONARY
Problem:
There was a memory leak in the function dict_create_add_foreign_to_dictionary().
The allocated pars_info_t object is not freed in the error code path.
Solution:
Allocate the pars_info_t object after the error checking.
rb#2368 in review
UPDATES
After checking that the table has changed too much in
row_update_statistics_if_needed() and calling dict_update_statistics(),
also check if the same condition holds after acquiring the table stats
latch. This is to avoid multiple threads concurrently entering and
executing the stats update code.
Approved by: Marko (rb:2186)
INSERT WITH SAME VALUES
Problem:
When a transaction is in READ COMMITTED isolation level, gap locks are still
taken in the secondary index, when row is inserted. This happens when the
secondary index is scanned for duplicate.
The function row_ins_scan_sec_index_for_duplicate() always calls the
function row_ins_set_shared_rec_lock() with LOCK_ORDINARY irrespective of
the transaction isolation level.
Solution:
The function row_ins_scan_sec_index_for_duplicate() calls the
function row_ins_set_shared_rec_lock() with LOCK_ORDINARY or
LOCK_REC_NOT_GAP based on the transaction isolation level.
rb://2035 approved by Krunal and Marko
This is a deadlock that will also be fixed in the server by
Bug #11844915 - HANG IN THDVAR MUTEX ACQUISITION.
So this is a simple alternate method of fixing the same problem,
but from within InnoDB.
The simple change is to make rename table start a transaction
before locking dict_sys->mutex since thd_supports_xa() can call
THDVAR which can lock a mutex, LOCK_global_system_variables, that
is used in the server by many other activities. At least one of
those, sys_var::update(), can call back into InnoDB and try to
lock dict_sys->mutex while holding LOCK_global_system_variables.
The other bug fix for 11844915 eliminates the use of
LOCK_global_system_variables for calls to THDVAR.
Approved by marko in http://rb.no.oracle.com/rb/r/2000/