The problem was due to a misuse of GCC asm constraints used to
implement a atomic load. On x86_64, the load was implemented
as a cmpxchg which implicitly uses the eax register as a
source and destination operand, yet the dummy value used for
comparison wasn't being properly loaded into eax (and other
problems).
The core problem is that cmpxchg is unnecessary as a load
on x86_64 as there are other simpler instructions such
as xadd. Even though, such instructions are only used to
have a memory barrier as load and stores are atomic by
definition. Hence, the solution is to explicitly issue the
required CPU and compiler barriers.
The problem was that the x86 assembly based atomic CAS
(compare and swap) implementation could copy the wrong
value to the ebx register, where the cmpxchg8b expects
to see part of the "comparand" value. Since the original
value in the ebx register is saved in the stack (that is,
the push instruction causes the stack pointer to change),
a wrong offset could be used if the compiler decides to
put the source of the comparand value in the stack.
The solution is to copy the comparand value directly from
memory. Since the comparand value is 64-bits wide, it is
copied in two steps over to the ebx and ecx registers.
Bug#52261: 64 bit atomic operations do not work on Solaris i386
gcc in debug compilation
One of the various problems was that the source operand to
CMPXCHG8b was marked as a input/output operand, causing GCC
to use the EBX register as the destination register for the
CMPXCHG8b instruction. This could lead to crashes as the EBX
register is also implicitly used by the instruction, causing
the value to be potentially garbaged and a protection fault
once the value is used to access a position in memory.
Another problem was the lack of proper clobbers for the atomic
operations and, also, a discrepancy between the implementations
for the Compare and Set operation. The specific problems are
described and fixed by Kristian Nielsen patches:
Patch: 1
Fix bugs in my_atomic_cas*(val,cmp,new) that *cmp is accessed
after CAS succeds.
In the gcc builtin implementation, problem was that *cmp was
read again after atomic CAS to check if old *val == *cmp;
this fails if CAS is successful and another thread modifies
*cmp in-between.
In the x86-gcc implementation, problem was that *cmp was set
also in the case of successful CAS; this means there is a
window where it can clobber a value written by another thread
after successful CAS.
Patch 2:
Add a GCC asm "memory" clobber to primitives that imply a
memory barrier.
This signifies to GCC that any potentially aliased memory
must be flushed before the operation, and re-read after the
operation, so that read or modification in other threads of
such memory values will work as intended.
In effect, it makes these primitives work as memory barriers
for the compiler as well as the CPU. This is better and more
correct than adding "volatile" to variables.
Conflicts
=========
Text conflict in .bzr-mysql/default.conf
Text conflict in libmysqld/CMakeLists.txt
Text conflict in libmysqld/Makefile.am
Text conflict in mysql-test/collections/default.experimental
Text conflict in mysql-test/extra/rpl_tests/rpl_row_sp006.test
Text conflict in mysql-test/suite/binlog/r/binlog_tmp_table.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata.result
Text conflict in mysql-test/suite/rpl/r/rpl_loaddata_fatal.result
Text conflict in mysql-test/suite/rpl/r/rpl_row_create_table.result
Text conflict in mysql-test/suite/rpl/r/rpl_row_sp006_InnoDB.result
Text conflict in mysql-test/suite/rpl/r/rpl_stm_log.result
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_circular_simplex.result
Text conflict in mysql-test/suite/rpl_ndb/r/rpl_ndb_sp006.result
Text conflict in mysql-test/t/mysqlbinlog.test
Text conflict in sql/CMakeLists.txt
Text conflict in sql/Makefile.am
Text conflict in sql/log_event_old.cc
Text conflict in sql/rpl_rli.cc
Text conflict in sql/slave.cc
Text conflict in sql/sql_binlog.cc
Text conflict in sql/sql_lex.h
21 conflicts encountered.
NOTE
====
mysql-5.1-rpl-merge has been made a mirror of mysql-next-mr:
- "mysql-5.1-rpl-merge$ bzr pull ../mysql-next-mr"
This is the first cset (merge/...) committed after pulling
from mysql-next-mr.
Use compiler provided atomic builtins as a 'backend' for
MySQL's atomic primitives. The builtins are available on
a handful of platforms and compilers.
rename *.t* to *-t* to be automake-friendly
simplify Makefiles
test_atomic.c:
move to unittest, add GPL comment, fix warnings, convert to tap framework.
configure:
remove custom tests for available types, use AC_CHECK_TYPE instead
x86-gcc.h:
fix gcc -ansi errors while maintaining readability
ignore:
added *-t