MDEV-6483 - Deadlock around rw_lock_debug_mutex on PPC64
This problem affects only debug builds on PPC64.
There are at least two race conditions around
rw_lock_debug_mutex_enter and rw_lock_debug_mutex_exit:
- rw_lock_debug_waiters was loaded/stored without setting
appropriate locks/memory barriers.
- there is a gap between calls to os_event_reset() and
os_event_wait() and in such case we're supposed to pass
return value of the former to the latter.
Fixed by replacing self-cooked spinlocks with system mutexes.
These days system mutexes offer much better performance. OTOH
performance is not that critical for debug builds.
MDEV-6450 - MariaDB crash on Power8 when built with advance tool
chain
InnoDB mutex_exit() function calls __sync_test_and_set() to release
the lock. According to manual this function is supposed to create
"acquire" memory barrier whereas in fact we need "release" memory
barrier at mutex_exit().
The problem isn't repeatable with gcc because it creates
"acquire-release" memory barrier for __sync_test_and_set().
ATC creates just "acquire" barrier.
Fixed by creating proper barrier at mutex_exit() by using
__sync_lock_release() instead of __sync_test_and_set().
Part of this work is based on Stewart Smitch's memory barrier and lower priori
patches for power8.
- Added memory syncronization for innodb & xtradb for power8.
- Added HAVE_WINDOWS_MM_FENCE to CMakeList.txt
- Added os_isync to fix a syncronization problem on power
- Added log_get_lsn_nowait which is now used srv_error_monitor_thread to ensur
if log mutex is locked.
All changes done both for InnoDB and Xtradb
~40% bugfixed(*) applied
~40$ bugfixed reverted (incorrect or we're not buggy)
~20% bugfixed applied, despite us being not buggy
(*) only changes in the server code, e.g. not cmakefiles
SLOW/CRASHES SEMAPHORE
Problem:
There are 2 lakh tables - fk_000001, fk_000002 ... fk_200000. All of them
are related to the same parent_table through a foreign key constraint.
When the parent_table is loaded into the dictionary cache, all the child table
will also be loaded. This is taking lot of time. Since this operation happens
when the dictionary latch is taken, the scenario leads to "long semaphore wait"
situation and the server gets killed.
Analysis:
A simple performance analysis showed that the slowness is because of the
dict_foreign_find() function. It does a linear search on two linked list
table->foreign_list and table->referenced_list, looking for a particular
foreign key object based on foreign->id as the key. This is called two
times for each foreign key object.
Solution:
Introduce a rb tree in table->foreign_rbt and table->referenced_rbt, which
are some sort of index on table->foreign_list and table->referenced_list
respectively, using foreign->id as the key. These rbt structures will be
solely used by dict_foreign_find().
rb#5599 approved by Vasil
This is not yet a fix. This is change to print additional information at the point
when this assertion is going to happen. Print as much information about the pages
and index to find out why next page is not a compact format.
Regression from bug#14621190 due to disabled optimistic restoration
of cursor, which required full key lookup instead of verifying
if previously positioned btree cursor could be reused.
Fixed by enable optimistic restore and adjust cursor afterward.
rb#3324 approved by Marko.
--Implemented CHECK TABLE...QUICK.
Introduce CHECK TABLE...QUICK that would skip the btr_validate_index()
and btr_search_validate() call, and count the no. of records in each index.
Approved by Marko and Kevin. (rb#3567).
The testcase for this bug fails randomly due to two reasons.
1. Due to ibuf merge happening background
2. Due to dict stats update which brings the evicted page back into
buffer pool.
Fix ibuf_contract_ext() to not do any merges with ibuf_debug enabled and
also changed dict_stats_update() to return fake statistics without
bringing the secondary index pages into buffer pool.
Approved by Marko. rb#3419
IT IS DONE IN-PLACE
With change buffer enabled, InnoDB doesn't write a transaction log
record when it merges a record from the insert buffer to an secondary
index page if the insertion is performed as an update-in-place.
Fixed by logging the 'update-in-place' operation on secondary index
pages.
Approved by Marko. rb#2429
Bug #16754901 PARS_INFO_FREE NOT CALLED IN DICT_CREATE_ADD_FOREIGN_TO_DICTIONARY
Problem:
There are two situations here. The constraint name is explicitly
given by the user and the constraint name is automatically generated
by InnoDB. In the case of generated constraint name, it is formed by
adding table name as prefix. The table names are stored internally in
my_charset_filename. In the case of constraint name explicitly given
by the user, it is stored in UTF8 format itself. So, in some
situations the constraint name is in utf8 and in some situations it is
in my_charset_filename format. Hence this problem.
Solution:
Always store the foreign key constraint name in UTF-8 even when
automatically generated.
Bug #16754901 PARS_INFO_FREE NOT CALLED IN DICT_CREATE_ADD_FOREIGN_TO_DICTIONARY
Problem:
There was a memory leak in the function dict_create_add_foreign_to_dictionary().
The allocated pars_info_t object is not freed in the error code path.
Solution:
Allocate the pars_info_t object after the error checking.
rb#2368 in review
After a clean shutdown, InnoDB will not check the *.ibd file headers,
for maximum performance. This is unchanged before and after this
patch.
What this fix addresses is the case when crash recovery is
needed. Previously, InnoDB could load a corrupted tablespace file.
buf_page_is_corrupted(): Add the parameter check_lsn.
fil_check_first_page(): New function, to perform a consistency check
on the first page of a file. This can be overridden by setting
innodb_force_recovery.
fil_read_first_page(), fil_open_single_table_tablespace(),
fil_load_single_table_tablespace(): Invoke fil_check_first_page().
open_or_create_data_files(): Check the status of
fil_open_single_table_tablespace().
rb#2352 approved by Jimmy Yang
TABLE/KEY RELATIONS
The DICT_FK_MAX_RECURSIVE_LOAD was reduced from 250 to 33 in rb#2058.
But in optimized build, this recursive depth is still too deep and
resulted in stack overflow. So reducing this depth to 20 now.
TABLE/KEY RELATIONS
Problem:
When there are many tables, linked together through the foreign key
constraints, then loading one table will recursively open other tables. This
can sometimes lead to thread stack overflow. In such situations the server
will exit.
I see the stack overflow problem when the thread_stack is 196608 (the default
value for 32-bit systems). I don't see the problem when the thread_stack is
set to 262144 (the default value for 64-bit systems).
Solution:
Currently, in InnoDB, there is a macro DICT_FK_MAX_RECURSIVE_LOAD which defines
the maximum number of tables that will be loaded recursively because of foreign
key relations. This is currently set to 250. We can reduce this number to 33
(anything more than 33 does not solve the problem for the default value). We
can keep it small enough so that thread stack overflow does not happen for the
default values. Reducing the DICT_FK_MAX_RECURSIVE_LOAD will not affect the
functionality of InnoDB. The tables will eventually be loaded.
rb#2058 approved by Marko
TABLE/KEY RELATIONS
Problem:
When there are many tables, linked together through the foreign key
constraints, then loading one table will recursively open other tables. This
can sometimes lead to thread stack overflow. In such situations the server
will exit.
I see the stack overflow problem when the thread_stack is 196608 (the default
value for 32-bit systems). I don't see the problem when the thread_stack is
set to 262144 (the default value for 64-bit systems).
Solution:
Currently, in InnoDB, there is a macro DICT_FK_MAX_RECURSIVE_LOAD which defines
the maximum number of tables that will be loaded recursively because of foreign
key relations. This is currently set to 250. We can reduce this number to 33
(anything more than 33 does not solve the problem for the default value). We
can keep it small enough so that thread stack overflow does not happen for the
default values. Reducing the DICT_FK_MAX_RECURSIVE_LOAD will not affect the
functionality of InnoDB. The tables will eventually be loaded.
rb#2058 approved by Marko
UPDATES
After checking that the table has changed too much in
row_update_statistics_if_needed() and calling dict_update_statistics(),
also check if the same condition holds after acquiring the table stats
latch. This is to avoid multiple threads concurrently entering and
executing the stats update code.
Approved by: Marko (rb:2186)
FROM SHOW CREATE
Problem: The length of the internally generated foreign key name
is not checked.
Solution: The length of the internally generated foreign key name is
checked. If it is greater than the allowed limit, an error message
is reported. Also, the constraint name is printed in the same manner
as the table name, using the system charset information.
rb://1969 approved by Marko.
DEREFERENCING UT_DBG_NULL_PTR
The abort() call is standard C but InnoDB only uses it in GCC
environments. UT_DBG_USE_ABORT is not defined the code crashed
by dereferencing a null pointer instead of calling abort().
Other code throughout MySQL including ndb, sql, mysys and other
places call abort() directly.
This bug also affects innodb.innodb_bug14147491.test which fails
randomly on windows because of this issue.
Approved by marko in http://rb.no.oracle.com/rb/r/1936/
Get rid of O(n^2) scan in dyn array (mtr->memo) operations, accessing
the dyn array blocks directly.
dyn_array_get_last_block(), dyn_array_get_next_block(),
dyn_array_get_prev_block(): Define as a constness-preserving macro.
Add const qualifiers to many dyn_array functions.
mtr_memo_slot_release_func(): Renamed from mtr_memo_slot_release():
Make mtr_t* a debug-only parameter. Assume that slot->object != NULL.
mtr_memo_pop_all(): Access the dyn_array blocks directly, replacing
O(n^2) operation with O(n).
mtr_memo_release(): Access the dyn_array blocks directly, replacing
O(n^2) operation with O(n). This caused the performance problem.
rb#1540 approved by Jimmy Yang