Moved gcc specific code to gcc_builtins.h.
Moved intptr into the black magic code block.
Moved definition of atomic operations for "long" out of black magic code block.
Clean-up nolock.h: it doesn't serve any purpose anymore. Appropriate code moved
to x86-gcc.h and my_atomic.h.
If gcc sync builtins were detected, we want to make use of them independently of
the __GNUC__ definition. E.g. XLC provides them, but doesn't define __GNUC__.
HS/Spider: According to the AIX manual, alloca() returns char*, which cannot be
cast to an arbitrary type with static_cast. Use an explicit cast instead.
MDL: Removed namemangling pragma, which didn't let MariaDB build with XLC.
WSREP: _int64 seems to conflict with a name used by XLC; replaced with _integer64.
CONNECT: RTLD_NOLOAD is a GNU extension. Removed the rather meaningless check
for whether the library is already loaded. Multiple dlopen()'s of the same
library are permitted, and it never gets closed anyway. The one exception was
the error path, which was a bug: it could close a library that is still
referenced by other subsystems.
InnoDB: __ppc_get_timebase() is a GNU extension. Only use it when __GLIBC__ is
defined.
Based on contribution by flynn1973.
In cmake tests, treat clang like gcc (same options,
same builtins) in many cases.
* don't check the compiler when
  * testing for -fvisibility=hidden support
  * testing for HAVE_ABI_CXA_DEMANGLE
  * testing for HAVE_GCC_ATOMIC_BUILTINS
  * when removing options with string(replace)
  * when running ${CC} --version (ignore the error instead)
* run ABI checks for clang
* use "canonical" gcc flags for clang
* fix groonga too
Also:
* add cmake detection for gcc __atomic_* builtins. They might be
  supported (__ATOMIC_SEQ_CST is defined), but not for all operand
  sizes. In particular, a 64-bit atomic load is problematic on i386
* cache check results for Windows
* remove the test for HAVE_CXXABI_H (HAVE_ABI_CXA_DEMANGLE is
  sufficient)
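To illustrate the kind of probe such a detection could compile and run,
here is a minimal sketch. The function name probe_atomic64_builtins is
hypothetical and this is not the source used by the actual cmake check;
it only exercises the __atomic_* builtins on a 64-bit operand, which is
the operand size noted as problematic on i386.

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical probe (not the real cmake test source): returns 0 when the
  64-bit __atomic_* builtins compile, link and behave as expected.
*/
int probe_atomic64_builtins(void)
{
  int64_t var= 1;
  int64_t expected= 3;
  int64_t seen;

  __atomic_store_n(&var, 2, __ATOMIC_SEQ_CST);
  seen= __atomic_fetch_add(&var, 1, __ATOMIC_SEQ_CST);   /* var: 2 -> 3 */
  if (seen != 2)
    return 1;
  /* CAS 3 -> 4 must succeed because var currently holds 3. */
  if (!__atomic_compare_exchange_n(&var, &expected, 4, 0,
                                   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
    return 2;
  return __atomic_load_n(&var, __ATOMIC_SEQ_CST) == 4 ? 0 : 3;
}
```

A probe of this shape fails at compile or link time when the builtins
are missing for the operand size, which is what a cmake check keys on.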
my_atomic_load() is implemented as __sync_fetch_and_or(var, 0), which
writes the or-ed value back to var. Such memory writes have worse
performance and scalability than plain reads.
gcc 4.7 and up offers better facility for atomic loads/stores. Use it
whenever it is available.
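A minimal sketch of the two approaches, for comparison. The function
names are hypothetical and are not the my_atomic.h API; the fallback
branch is an assumption about how a build without __atomic_* builtins
could behave.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not the actual my_atomic.h code. */

/* Older approach: a read-modify-write that ORs in 0, so it also stores. */
int32_t load32_via_fetch_or(volatile int32_t *var)
{
  return __sync_fetch_and_or(var, 0);
}

/* gcc 4.7+ approach: a plain atomic load, no store to var involved. */
int32_t load32_via_atomic_load(volatile int32_t *var)
{
#if defined(__ATOMIC_SEQ_CST)
  return __atomic_load_n(var, __ATOMIC_SEQ_CST);
#else
  /* Assumed fallback when the __atomic_* builtins are unavailable. */
  return __sync_fetch_and_or(var, 0);
#endif
}
```

Both return the same value; the difference is that the second does not
take the cache line for writing.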
The problem was due to a misuse of the GCC asm constraints used to
implement an atomic load. On x86_64, the load was implemented
as a cmpxchg, which implicitly uses the eax register as a
source and destination operand, yet the dummy value used for
comparison wasn't being properly loaded into eax (among other
problems).
The core problem is that cmpxchg is unnecessary for a load
on x86_64, as there are other, simpler instructions such
as xadd. Even then, such instructions are only used to
provide a memory barrier, as loads and stores are atomic by
definition. Hence, the solution is to explicitly issue the
required CPU and compiler barriers.
include/atomic/x86-gcc.h:
Issue a synchronizing instruction before loading the value.
Afterwards, issue a compiler barrier to prevent reordering.
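A minimal sketch of that load sequence, under stated assumptions: the
names compiler_barrier and load64_with_barriers are hypothetical, and
__sync_synchronize() stands in for whichever synchronizing instruction
the header actually issues. It also assumes a target such as x86_64
where an aligned 64-bit load is atomic.

```c
#include <assert.h>
#include <stdint.h>

/*
  Compiler-only barrier: emits no instruction, but the "memory" clobber
  stops the compiler from moving memory accesses across it.
*/
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical helper; not the code in include/atomic/x86-gcc.h. */
int64_t load64_with_barriers(volatile int64_t *var)
{
  int64_t value;

  __sync_synchronize();   /* full CPU barrier before the load */
  value= *var;            /* plain load; assumed atomic on the target */
  compiler_barrier();     /* keep later accesses from being hoisted */
  return value;
}
```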
The problem was that the x86 assembly based atomic CAS
(compare and swap) implementation could copy the wrong
value to the ebx register, where the cmpxchg8b expects
to see part of the "comparand" value. Since the original
value in the ebx register is saved in the stack (that is,
the push instruction causes the stack pointer to change),
a wrong offset could be used if the compiler decides to
put the source of the comparand value in the stack.
The solution is to copy the comparand value directly from
memory. Since the comparand value is 64 bits wide, it is
copied in two steps over to the ebx and ecx registers.
include/atomic/x86-gcc.h:
For reference, an excerpt from a faulty binary follows.
It is a disassembly of my_atomic-t, compiled at -O3 with
ICC 11.0. Most of the code deals with preparations for
an atomic cmpxchg8b operation. This instruction compares
the value in edx:eax with the destination operand. If the
values are equal, the value in ecx:ebx is stored in the
destination, otherwise the value in the destination operand
is copied into edx:eax.
In this case, my_atomic_add64 is implemented as a compare
and exchange. The addition is done over temporary storage
and loaded into the destination if the original term value
is still valid.
volatile int64 a64;
int64 b=0x1000200030004000LL;
a64=0;
mov 0xfffffda8(%ebx),%eax
xor %ebp,%ebp
mov %ebp,(%eax)
mov %ebp,0x4(%eax)
my_atomic_add64(&a64, b);
mov 0xfffffda8(%ebx),%ebp # Load address of a64
mov 0x0(%ebp),%edx # Copy value
mov 0x4(%ebp),%ecx
mov %edx,0xc(%esp) # Assign to tmp var in the stack
mov %ecx,0x10(%esp)
add $0x30004000,%edx # Sum values
adc $0x10002000,%ecx
mov %edx,0x8(%esp) # Save part of result for later
mov 0x0(%ebp),%esi # Copy value of a64 again
mov 0x4(%ebp),%edi
mov 0xc(%esp),%eax # Load the value of a64 used
mov 0x10(%esp),%edx # for comparison
mov %esi,(%esp)
mov %edi,0x4(%esp)
push %ebx # Push %ebx into stack. Changes esp.
mov 0x8(%esp),%ebx # Wrong restore of the result.
lock cmpxchg8b 0x0(%ebp)
sete %cl
pop %ebx
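The compare-and-exchange form of a 64-bit add described above can be
sketched in C as follows. The name add64_via_cas is hypothetical, the
gcc __sync builtin is used in place of the hand-written asm, and
returning the previous value is an assumption about the intended
semantics.

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical helper: add v to *a with a compare-and-swap loop and
  return the value *a held before the addition (assumed semantics).
*/
int64_t add64_via_cas(volatile int64_t *a, int64_t v)
{
  int64_t old, sum;

  do
  {
    old= *a;        /* snapshot; the CAS below validates it */
    sum= old + v;   /* the addition is done on temporary storage */
    /* Store sum only if *a still equals the snapshot, else retry. */
  } while (!__sync_bool_compare_and_swap(a, old, sum));
  return old;
}
```

If another thread changes *a between the snapshot and the CAS, the CAS
fails and the loop recomputes the sum from a fresh snapshot.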
Bug#52261: 64 bit atomic operations do not work on Solaris i386
gcc in debug compilation
One of the various problems was that the source operand to
CMPXCHG8b was marked as an input/output operand, causing GCC
to use the EBX register as the destination register for the
CMPXCHG8b instruction. This could lead to crashes as the EBX
register is also implicitly used by the instruction, causing
the value to be potentially corrupted and a protection fault
once the value is used to access a position in memory.
Another problem was the lack of proper clobbers for the atomic
operations and, also, a discrepancy between the implementations
of the Compare and Set operation. The specific problems are
described and fixed by Kristian Nielsen's patches:
Patch 1:
Fix bugs in my_atomic_cas*(val,cmp,new) where *cmp is accessed
after the CAS succeeds.
In the gcc builtin implementation, the problem was that *cmp was
read again after the atomic CAS to check if old *val == *cmp;
this fails if the CAS is successful and another thread modifies
*cmp in-between.
In the x86-gcc implementation, the problem was that *cmp was set
also in the case of a successful CAS; this means there is a
window where it can clobber a value written by another thread
after a successful CAS.
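A minimal sketch of a CAS wrapper that avoids both problems. The name
cas32 is hypothetical and this is not the patched code itself: *cmp is
read exactly once before the operation, and written only when the CAS
fails.

```c
#include <assert.h>
#include <stdint.h>

/*
  Hypothetical helper illustrating the intended behaviour:
  returns 1 and leaves *cmp untouched on success,
  returns 0 and stores the value seen in *val into *cmp on failure.
*/
int cas32(volatile int32_t *val, int32_t *cmp, int32_t set)
{
  int32_t expected= *cmp;   /* the only read of *cmp */
  int32_t seen= __sync_val_compare_and_swap(val, expected, set);

  if (seen == expected)
    return 1;               /* success: do not read or write *cmp */
  *cmp= seen;               /* failure: report the current value */
  return 0;
}
```

The success test compares against the local copy, so it stays correct
even if another thread modifies or frees *cmp right after the CAS.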
Patch 2:
Add a GCC asm "memory" clobber to primitives that imply a
memory barrier.
This signifies to GCC that any potentially aliased memory
must be flushed before the operation, and re-read after the
operation, so that read or modification in other threads of
such memory values will work as intended.
In effect, it makes these primitives work as memory barriers
for the compiler as well as the CPU. This is better and more
correct than adding "volatile" to variables.
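A minimal sketch of an exchange primitive carrying the "memory"
clobber. The name exchange32 is hypothetical and the asm is an
illustration, not a copy of x86-gcc.h; the non-x86 branch is an assumed
fallback so that the sketch builds elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: atomically store newval and return the old value. */
int32_t exchange32(volatile int32_t *ptr, int32_t newval)
{
#if defined(__i386__) || defined(__x86_64__)
  /*
    xchg with a memory operand is implicitly locked on x86. The "memory"
    clobber makes the statement a compiler barrier too: cached values of
    other variables are flushed before it and re-read after it.
  */
  __asm__ __volatile__("xchgl %0, %1"
                       : "+r" (newval), "+m" (*ptr)
                       :
                       : "memory");
  return newval;
#else
  /* Assumed fallback for other targets. */
  return __atomic_exchange_n(ptr, newval, __ATOMIC_SEQ_CST);
#endif
}
```

Without the clobber, the compiler may keep unrelated shared variables
in registers across the asm statement, which defeats the barrier.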
include/atomic/gcc_builtins.h:
Do not read from *cmp after the operation as it might be
already gone if the operation was successful.
include/atomic/nolock.h:
Prefer system provided atomics over the broken x86 asm.
include/atomic/x86-gcc.h:
Do not mark source operands as input/output operands.
Add proper memory clobbers.
include/my_atomic.h:
Add notes about my_atomic_add and my_atomic_cas behaviors.
unittest/mysys/my_atomic-t.c:
Remove the workaround. If the test fails, there is either a problem
with the atomic operations code or the specific compiler
version should be black-listed.
The atomic operations implementation on 5.1 has a few problems,
which might cause tests to abort randomly. Since no code in 5.1
uses atomic operations, simply remove the code.
Conflicts
=========
Text conflict in .bzr-mysql/default.conf
Text conflict in libmysqld/CMakeLists.txt
Text conflict in libmysqld/Makefile.am
Text conflict in mysql-test/collections/default.experimental
Text conflict in mysql-test/extra/rpl_tests/rpl_row_sp006.test
Text conflict in mysql-test/suite/binlog/r/binlog_tmp_table.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata_fatal.result
Text conflict in mysql-test/suite/rpl/r/rpl_row_create_table.result
Text conflict in mysql-test/suite/rpl/r/rpl_row_sp006_InnoDB.result
Text conflict in mysql-test/suite/rpl/r/rpl_stm_log.result
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_circular_simplex.result
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_sp006.result
Text conflict in mysql-test/t/mysqlbinlog.test
Text conflict in sql/CMakeLists.txt
Text conflict in sql/Makefile.am
Text conflict in sql/log_event_old.cc
Text conflict in sql/rpl_rli.cc
Text conflict in sql/slave.cc
Text conflict in sql/sql_binlog.cc
Text conflict in sql/sql_lex.h
21 conflicts encountered.
NOTE
====
mysql-5.1-rpl-merge has been made a mirror of mysql-next-mr:
- "mysql-5.1-rpl-merge$ bzr pull ../mysql-next-mr"
This is the first cset (merge/...) committed after pulling
from mysql-next-mr.