Followup from 5.5 patch. Removing memory barriers on intel is wrong as
this doesn't prevent the compiler and/or processor from reorganizing reads
before the mutex release. Forcing a memory barrier before reading the waiters will
guarantee that no speculative reading takes place.
Fix memory barrier issues on releasing mutexes. We must have a full
memory barrier between releasing a mutex lock and reading its waiters.
This prevents us from missing to release waiters due to reading the
number of waiters speculatively before releasing the lock. If threads
try and wait between us reading the waiters count and releasing the
lock, those threads might stall indefinitely.
Also, we must use proper ACQUIRE/RELEASE semantics for atomic
operations, not ACQUIRE/ACQUIRE.
Provided IBM System Z have outdated compiler version, which supports gcc sync
builtins but not gcc atomic builtins. It also has weak memory model.
InnoDB attempted to verify if __sync_lock_test_and_set() is available by
checking IB_STRONG_MEMORY_MODEL. This macro has nothing to do with availability
of __sync_lock_test_and_set(), the right one is HAVE_ATOMIC_BUILTINS.
This is an addendum to the fix for MDEV-7026. The ARM memory model is
similar to that of PowerPC and thus needs the same semantics with
respect to memory barriers. That is, os_atomic_test_and_set_*_release()
must be a store with a release barrier followed by a full
barrier. Unlike x86 using __sync_lock_test_and_set() which is
implemented as “exclusive load with acquire barriers + exclusive store”
is insufficient in contexts where os_atomic_test_and_set_*_release()
macros are used.
The bug was that full memory barrier was missing in the code that ensures that
a waiter on an InnoDB mutex will not go to sleep unless it is guaranteed to be
woken up again by another thread currently holding the mutex. This made
possible a race where a thread could get stuck waiting for a mutex that is in
fact no longer locked. If that thread was also holding other critical locks,
this could stall the entire server. There is an error monitor thread than can
break the stall, it runs about once per second. But if the error monitor
thread itself got stuck or was not running, then the entire server could hang
infinitely.
This was introduced on i386/amd64 platforms in 5.5.40 and 10.0.13 by an
incorrect patch that tried to fix the similar problem for PowerPC.
This commit reverts the incorrect PowerPC patch, and instead implements a fix
for PowerPC that does not change i386/amd64 behaviour, making PowerPC work
similarly to i386/amd64.
MDEV-6450 - MariaDB crash on Power8 when built with advance tool
chain
InnoDB mutex_exit() function calls __sync_test_and_set() to release
the lock. According to manual this function is supposed to create
"acquire" memory barrier whereas in fact we need "release" memory
barrier at mutex_exit().
The problem isn't repeatable with gcc because it creates
"acquire-release" memory barrier for __sync_test_and_set().
ATC creates just "acquire" barrier.
Fixed by creating proper barrier at mutex_exit() by using
__sync_lock_release() instead of __sync_test_and_set().
Part of this work is based on Stewart Smitch's memory barrier and lower priori
patches for power8.
- Added memory syncronization for innodb & xtradb for power8.
- Added HAVE_WINDOWS_MM_FENCE to CMakeList.txt
- Added os_isync to fix a syncronization problem on power
- Added log_get_lsn_nowait which is now used srv_error_monitor_thread to ensur
if log mutex is locked.
All changes done both for InnoDB and Xtradb
tool chain
This is an addition to the original patch. On Windows
InterlockedExchange implies full memory barrier, whereas
only acquire/release barriers required.
chain
InnoDB mutex_exit() function calls __sync_test_and_set() to release
the lock. According to manual this function is supposed to create
"acquire" memory barrier whereas in fact we need "release" memory
barrier at mutex_exit().
The problem isn't repeatable with gcc because it creates
"acquire-release" memory barrier for __sync_test_and_set().
ATC creates just "acquire" barrier.
Fixed by creating proper barrier at mutex_exit() by using
__sync_lock_release() instead of __sync_test_and_set().
support ha_innodb.so as a dynamic plugin.
* remove obsolete *,innodb_plugin.rdiff files
* s/--plugin-load=/--plugin-load-add=/
* MYSQL_PLUGIN_IMPORT glob_hostname[]
* use my_error instead of push_warning_printf(ER_DEFAULT)
* don't use tdc_size and tc_size in a module
update test cases (XtraDB is 5.6.14, InnoDB is 5.6.10)
* copy new tests over
* disable some tests for (old) InnoDB
* delete XtraDB tests that no longer apply
small compatibility changes:
* s/HTON_EXTENDED_KEYS/HTON_SUPPORTS_EXTENDED_KEYS/
* revert unnecessary InnoDB changes to make it a bit closer to the upstream
fix XtraDB to compile on Windows (both as a static and a dynamic plugin)
disable XtraDB on Windows (deadlocks) and where no atomic ops are available (e.g. CentOS 5)
storage/innobase/handler/ha_innodb.cc:
revert few unnecessary changes to make it a bit closer to the original InnoDB
storage/innobase/include/univ.i:
correct the version to match what it was merged from
Implement os_event_wait_time() for POSIX systems.
In the purge thread, use os_event_wait_time() when sleeping rather than sleep,
and signal the event when server shuts down, so we do not need to wait for
upto 10 seconds until the purge thread wakes up.
Also fix bug that warnings that were pushed after we call set_ok_status() were
not included in the waning count sent to the client in the result packet.
Also in mysqltest, in recursive die() invocation at least print the message
before aborting.
client/mysqltest.cc:
If we detect recursive die(), at least print the message before aborting.
mysql-test/r/warnings_debug.result:
Test case.
mysql-test/t/warnings_debug.test:
Test case.
sql/handler.cc:
Force generation of a warning with specific debug option, for testing.
sql/protocol.cc:
Fix wrong DBUG_ENTER
sql/sql_class.h:
Add method to count warnings pushed after set_ok_status() is called.
sql/sql_error.cc:
Also count warnings pushed after set_ok_status() is called.
storage/xtradb/include/os0sync.h:
Implement working os_sync_wait_time() for POSIX.
storage/xtradb/include/srv0srv.h:
Make the purge thread wait for an event when sleeping, so we can signal it to
wakeup immediately at shutdown.
storage/xtradb/log/log0log.c:
Make the purge thread wait for an event when sleeping, so we can signal it to
wakeup immediately at shutdown.
storage/xtradb/os/os0sync.c:
Implement working os_sync_wait_time() for POSIX.
storage/xtradb/srv/srv0srv.c:
Make the purge thread wait for an event when sleeping, so we can signal it to
wakeup immediately at shutdown.