MariaDB server crashes on ARM (weak memory model architecture) while
concurrently executing l_find to load node->key and add_to_purgatory
to store node->key = NULL. l_find then uses key (which is NULL), to
pass it to a comparison function.
The specific problem is the out-of-order execution that happens on a
weak memory model architecture. Two essential reorderings are possible,
which need to be prevented.
a) As l_find has no barriers in place between the optimistic read of
the key field lf_hash.cc#L117 and the verification of link lf_hash.cc#L124,
the processor can reorder the load to happen after the while-loop.
In that case, a concurrent thread executing add_to_purgatory on the same
node can be scheduled to store NULL at the key field lf_alloc-pin.c#L253
before key is loaded in l_find.
b) A node is marked as deleted by a CAS in l_delete lf_hash.cc#L247 and
taken off the list with an upfollowing CAS lf_hash.cc#L252. Only if both
CAS succeed, the key field is written to by add_to_purgatory. However,
due to a missing barrier, the relaxed store of key lf_alloc-pin.c#L253
can be moved ahead of the two CAS operations, which makes the value of
the local purgatory list stored by add_to_purgatory visible to all threads
operating on the list. As the node is not marked as deleted yet, the
same error occurs in l_find.
This change three accesses to be atomic.
* optimistic read of key in l_find lf_hash.cc#L117
* read of link for verification lf_hash.cc#L124
* write of key in add_to_purgatory lf_alloc-pin.c#L253
Reviewers: Sergei Vojtovich, Sergei Golubchik
Fixes: MDEV-23510 / d30c1331a18d875e553f3fcf544997e4f33fb943
Some architectures (mips) require libatomic to support proper
atomic operations. Check first if support is available without
linking, otherwise use the library.
Contributors:
James Cowgill <jcowgill@debian.org>
Jessica Clarke <jrtc27@debian.org>
Vicențiu Ciorbaru <vicentiu@mariadb.org>
Create minidump when server fails to shutdown. If process is being
debugged, cause a debug break.
Moves some code which is part of safe_kill into mysys, as both safe_kill,
and mysqltest produce minidumps on different timeouts.
Small cleanup in wait_until_dead() - replace inefficient loop with a single
wait.
Thanks to Fabian Vogt for noticing the mutual exclusions
of these open flags on tmpfs caused by mariadb opening it
incorrectly.
As such we clear the O_CREAT flag while opening it as O_TMPFILE.
This is a side-effect of my_large_malloc() introduction,MDEV-18851
It removed a cast to size_t to variable 'blocks' in
multiplication blocks * keycache->key_cache_block_size , creating ulong value
instead of correct size_t.
Replaced a couple of ulongs with appropriate data type, which is size_t.
Also, fixed casts to ulongs in crash handler messages, so that people would
not be confused by that, too.
Interestingly, aria did not expose the same problem even if it contains
copied and pasted code in ma_pagecache, because Aria had some ulongs removed
when fixing a similar problem in MDEV-9256.
init_mutex_v1_t: Stop lying that the mutex parameter is const.
GCC 11.2.0 assumes that it is and could complain about any mysql_mutex_t
being uninitialized even after mysql_mutex_init() as long as
PLUGIN_PERFSCHEMA is enabled.
init_rwlock_v1_t, init_cond_v1_t: Remove untruthful const qualifiers.
Note: init_socket_v1_t is expecting that the socket fd has already
been created before PSI_SOCKET_CALL(init_socket), and therefore that
parameter really is being treated as a pointer to const.
my_init_atomic_write(): Detect all forms of SSD, in case multiple
types of devices are installed in the same machine.
This was broken in commit ed008a74cf
and further in commit 70684afef2.
SAME_DEV(): Match block devices, ignoring partition numbers.
Let us use stat() instead of lstat(), in case someone has a symbolic
link in /dev.
Instead of reporting errors with perror(), let us use fprintf(stderr)
with the file name, the impact of the error, and the strerror(errno).
Because this code is specific to Linux, we may depend on the
GNU libc/uClibc/musl extension %m for strerror(errno).
This needs backporting to MariaDB 10.5.
Any changes I submit are freely available under the new BSD
license.
Signed-off-by: Nia Alarie <nia@NetBSD.org>
Removed redundant code for BF abort transaction in `thr_lock.cc`.
TOI operations will ignore provided lock_wait_timeout and use `LONG_TIMEOUT`
until operation is finished.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.
The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
memory access of integers. Fixed by using byte_order_generic.h when
compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
safe to have overflows (two cases, in item_func.cc).
Things fixed:
- Don't left shift signed values
(byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
constructors. This was needed as UBSAN checks that these types has
correct values when one copies an object.
(gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
deleted objects.
(events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
on Query_arena object.
- Fixed several cast of objects to an incompatible class!
(Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
This includes also ++ and -- of integers.
(Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
value_type is initialized to this instead of to -1, which is not a valid
enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
instead of a null string (safer as it ensures we do not do arithmetic
on null strings).
Other things:
- Changed struct st_position to an OBJECT and added an initialization
function to it to ensure that we do not copy or use uninitialized
members. The change to a class was also motived that we used "struct
st_position" and POSITION randomly trough the code which was
confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr. (This variable was before
only in 10.5 and up). It can now have one of two values:
ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
it virtual. This was an effort to get UBSAN to work with loaded storage
engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
to integer arithmetic.
Changes that should not be needed but had to be done to suppress warnings
from UBSAN:
- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
some compile time warnings.
Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
It seems some overly tolerant compilers (gcc) allow the structure
of IO_CACHE that is defined differently in libmaria to have
members equalivance to the iocache in mysys.
More strict Solaris compilers recognise that rc_pos really
isn't a structure member and won't compile.
In commit d25f806d73 (MDEV-22749)
the CRC-32C implementation of MariaDB was broken on some
IA-32 and AMD64 builds, depending on the compiler version and
build options. This was verified for IA-32 on GCC 10.2.1.
Even though we try to identify the SSE4.2 extensions and the
availaibility of the PCLMULQDQ instruction by executing CPUID,
the fall-back code could be generated with extended instructions,
because the entire file mysys/crc32/crc32c.c was being compiled
with -msse4.2 -mpclmul. This would cause SIGILL on a PINSRD
instruction on affected IA-32 targets (such as some Intel Atom
processors). This might also affect old AMD64 processors
(predating the 2007 Intel Nehalem microarchitecture), if some
compiler chose to emit the offending instructions.
While it is fine to pass a target-specific option to a target-specific
compilation unit (like -mpclmul to a PCLMUL-specific compilation unit),
that is not safe for mixed-architecture compilation units.
For mixed-architecture compilation units, the correct way is to set
target attributes on the target-specific functions.
There does not seem to be a way to pass target attributes to
function template instantiation. Hence, we must replace the
ExtendImpl template with plain functions crc32_sse42() and
crc32_slow().
We will also remove some inconsistency between
my_crc32_implementation() and mysys_namespace::crc32::Choose_Extend().
The function crc32_pclmul_enabled() will be moved to mysys/crc32/crc32c.cc
so that the detection code will be compiled without -msse4.2 -mpclmul.
The AMD64 PCLMUL accelerated crc32c_3way() will be moved to a new
file crc32c_amd64.cc. In this way, only a few functions that depend
on -msse4.2 in mysys/crc32/crc32c.cc can be declared with
__attribute__((target("sse4.2"))), and most of the file can be compiled
for the generic target.
Last, the file mysys/crc32ieee.cc will be omitted on 64-bit POWER,
because it was dead code (no symbols were exported).
Reviewed by: Vladislav Vaintroub