Followup from 5.5 patch. Removing memory barriers on intel is wrong as
this doesn't prevent the compiler and/or processor from reorganizing reads
before the mutex release. Forcing a memory barrier before reading the waiters will
guarantee that no speculative reading takes place.
Fix memory barrier issues on releasing mutexes. We must have a full
memory barrier between releasing a mutex lock and reading its waiters.
This prevents us from missing to release waiters due to reading the
number of waiters speculatively before releasing the lock. If threads
try and wait between us reading the waiters count and releasing the
lock, those threads might stall indefinitely.
Also, we must use proper ACQUIRE/RELEASE semantics for atomic
operations, not ACQUIRE/ACQUIRE.
Provided IBM System Z have outdated compiler version, which supports gcc sync
builtins but not gcc atomic builtins. It also has weak memory model.
InnoDB attempted to verify if __sync_lock_test_and_set() is available by
checking IB_STRONG_MEMORY_MODEL. This macro has nothing to do with availability
of __sync_lock_test_and_set(), the right one is HAVE_ATOMIC_BUILTINS.
The root cause is that x86 has a stronger memory model than the ARM
processors. And the GCC builtins didn't issue the correct fences when
setting/unsetting the lock word. In particular during the mutex release.
The solution is rewriting atomic TAS operations: replace '__sync_' by
'__atomic_' if possible.
Reviewed-by: Sunny Bains <sunny.bains@oracle.com>
Reviewed-by: Bin Su <bin.x.su@oracle.com>
Reviewed-by: Debarun Banerjee <debarun.banerjee@oracle.com>
Reviewed-by: Krunal Bauskar <krunal.bauskar@oracle.com>
RB: 9782
RB: 9665
RB: 9783
This is an addendum to the fix for MDEV-7026. The ARM memory model is
similar to that of PowerPC and thus needs the same semantics with
respect to memory barriers. That is, os_atomic_test_and_set_*_release()
must be a store with a release barrier followed by a full
barrier. Unlike x86 using __sync_lock_test_and_set() which is
implemented as “exclusive load with acquire barriers + exclusive store”
is insufficient in contexts where os_atomic_test_and_set_*_release()
macros are used.
The bug was that full memory barrier was missing in the code that ensures that
a waiter on an InnoDB mutex will not go to sleep unless it is guaranteed to be
woken up again by another thread currently holding the mutex. This made
possible a race where a thread could get stuck waiting for a mutex that is in
fact no longer locked. If that thread was also holding other critical locks,
this could stall the entire server. There is an error monitor thread than can
break the stall, it runs about once per second. But if the error monitor
thread itself got stuck or was not running, then the entire server could hang
infinitely.
This was introduced on i386/amd64 platforms in 5.5.40 and 10.0.13 by an
incorrect patch that tried to fix the similar problem for PowerPC.
This commit reverts the incorrect PowerPC patch, and instead implements a fix
for PowerPC that does not change i386/amd64 behaviour, making PowerPC work
similarly to i386/amd64.
MDEV-6450 - MariaDB crash on Power8 when built with advance tool
chain
InnoDB mutex_exit() function calls __sync_test_and_set() to release
the lock. According to manual this function is supposed to create
"acquire" memory barrier whereas in fact we need "release" memory
barrier at mutex_exit().
The problem isn't repeatable with gcc because it creates
"acquire-release" memory barrier for __sync_test_and_set().
ATC creates just "acquire" barrier.
Fixed by creating proper barrier at mutex_exit() by using
__sync_lock_release() instead of __sync_test_and_set().
Part of this work is based on Stewart Smitch's memory barrier and lower priori
patches for power8.
- Added memory syncronization for innodb & xtradb for power8.
- Added HAVE_WINDOWS_MM_FENCE to CMakeList.txt
- Added os_isync to fix a syncronization problem on power
- Added log_get_lsn_nowait which is now used srv_error_monitor_thread to ensur
if log mutex is locked.
All changes done both for InnoDB and Xtradb
tool chain
This is an addition to the original patch. On Windows
InterlockedExchange implies full memory barrier, whereas
only acquire/release barriers required.
chain
InnoDB mutex_exit() function calls __sync_test_and_set() to release
the lock. According to manual this function is supposed to create
"acquire" memory barrier whereas in fact we need "release" memory
barrier at mutex_exit().
The problem isn't repeatable with gcc because it creates
"acquire-release" memory barrier for __sync_test_and_set().
ATC creates just "acquire" barrier.
Fixed by creating proper barrier at mutex_exit() by using
__sync_lock_release() instead of __sync_test_and_set().
On shutdown, do not exit threads in os_event_wait(). This method of
exiting was only used by the I/O handler threads. Exit them on a
higher level.
os_event_wait_low(), os_event_wait_time_low(): Do not exit on shutdown.
os_thread_exit(), ut_dbg_assertion_failed(), ut_print_timestamp(): Add
attribute cold, so that GCC knows that these functions are rarely
invoked and can be optimized for size.
os_aio_linux_collect(): Return on shutdown.
os_aio_linux_handle(), os_aio_simulated_handle(), os_aio_windows_handle():
Set *message1 = *message2 = NULL and return TRUE on shutdown.
fil_aio_wait(): Return on shutdown.
logs_empty_and_mark_files_at_shutdown(): Even in very fast shutdown
(innodb_fast_shutdown=2), allow the background threads to exit, but
skip the flushing and log checkpointing.
innobase_shutdown_for_mysql(): Always wait for all the threads to exit.
rb:633 approved by Sunny Bains
On shutdown, do not exit threads in os_event_wait(). This method of
exiting was only used by the I/O handler threads. Exit them on a
higher level.
os_event_wait_low(), os_event_wait_time_low(): Do not exit on shutdown.
os_thread_exit(), ut_dbg_assertion_failed(), ut_print_timestamp(): Add
attribute cold, so that GCC knows that these functions are rarely
invoked and can be optimized for size.
os_aio_linux_collect(): Return on shutdown.
os_aio_linux_handle(), os_aio_simulated_handle(), os_aio_windows_handle():
Set *message1 = *message2 = NULL and return TRUE on shutdown.
fil_aio_wait(): Return on shutdown.
logs_empty_and_mark_files_at_shutdown(): Even in very fast shutdown
(innodb_fast_shutdown=2), allow the background threads to exit, but
skip the flushing and log checkpointing.
innobase_shutdown_for_mysql(): Always wait for all the threads to exit.
rb:633 approved by Sunny Bains
Add new function os_cond_wait_timed(). Change the os_thread_sleep() calls
to timed conditional waits. Signal the background threads during the shutdown
phase so that we avoid waiting for the sleep to timeout thus saving some time.
rb://439 -- Approved by Jimmy Yang
Add new function os_cond_wait_timed(). Change the os_thread_sleep() calls
to timed conditional waits. Signal the background threads during the shutdown
phase so that we avoid waiting for the sleep to timeout thus saving some time.
rb://439 -- Approved by Jimmy Yang
This is a manual merge from mysql-5.1-innodb of:
revision-id: vasil.dimov@oracle.com-20100930102618-s9f9ytbytr3eqw9h
committer: Vasil Dimov <vasil.dimov@oracle.com>
timestamp: Thu 2010-09-30 13:26:18 +0300
message:
Fix a potential bug when using __sync_lock_test_and_set()
TYPE __sync_lock_test_and_set (TYPE *ptr, TYPE value, ...)
it is not documented what happens if the two arguments are of different
type like it was before: the first one was lock_word_t (byte) and the
second one was 1 or 0 (int).
Approved by: Marko (via IRC)
This is a manual merge from mysql-5.1-innodb of:
revision-id: vasil.dimov@oracle.com-20100930102618-s9f9ytbytr3eqw9h
committer: Vasil Dimov <vasil.dimov@oracle.com>
timestamp: Thu 2010-09-30 13:26:18 +0300
message:
Fix a potential bug when using __sync_lock_test_and_set()
TYPE __sync_lock_test_and_set (TYPE *ptr, TYPE value, ...)
it is not documented what happens if the two arguments are of different
type like it was before: the first one was lock_word_t (byte) and the
second one was 1 or 0 (int).
Approved by: Marko (via IRC)
This patch was originally developed by Vladislav Vaintroub.
The main changes are:
* Use TryEnterCriticalSection in os_fast_mutex_trylock().
* Use lightweight condition variables on Vista or later Windows;
but fall back to events on older Windows, such as XP.
This patch also fixes the following bugs:
bug# 52102 InnoDB Plugin shows performance drop compared to InnoDB
on Windows
bug# 53204 os_fastmutex_trylock is implemented incorrectly on Windows
rb://363 approved by Inaam Rana
This patch was originally developed by Vladislav Vaintroub.
The main changes are:
* Use TryEnterCriticalSection in os_fast_mutex_trylock().
* Use lightweight condition variables on Vista or later Windows;
but fall back to events on older Windows, such as XP.
This patch also fixes the following bugs:
bug# 52102 InnoDB Plugin shows performance drop compared to InnoDB
on Windows
bug# 53204 os_fastmutex_trylock is implemented incorrectly on Windows
rb://363 approved by Inaam Rana
layout as we always had in trees containing only the builtin
2) win\configure.js WITH_INNOBASE_STORAGE_ENGINE still works.
storage/innobase/CMakeLists.txt:
fix to new directory name (and like 5.1)
storage/innobase/Makefile.am:
fix to new directory name (and like 5.1)
storage/innobase/handler/ha_innodb.cc:
fix to new directory name (and like 5.1)
storage/innobase/plug.in:
fix to new directory name (and like 5.1)
bzr branch mysql-5.1-performance-version mysql-trunk # Summit
cd mysql-trunk
bzr merge mysql-5.1-innodb_plugin # which is 5.1 + Innodb plugin
bzr rm innobase # remove the builtin
Next step: build, test fixes.
bzr branch mysql-5.1-performance-version mysql-trunk # Summit
cd mysql-trunk
bzr merge mysql-5.1-innodb_plugin # which is 5.1 + Innodb plugin
bzr rm innobase # remove the builtin
Next step: build, test fixes.