Commit graph

2 commits

Author SHA1 Message Date
Martin Beck
4ebac0fc86 MDEV-27088: Server crash on ARM (WMM architecture) due to missing barriers in lf-hash (10.5)
MariaDB server crashes on ARM (weak memory model architecture) while
concurrently executing l_find to load node->key and add_to_purgatory
to store node->key = NULL. l_find then uses key (which is NULL), to
pass it to a comparison function.

The specific problem is the out-of-order execution that happens on a
weak memory model architecture. Two essential reorderings are possible,
which need to be prevented.

a) As l_find has no barriers in place between the optimistic read of
the key field lf_hash.cc#L117 and the verification of link lf_hash.cc#L124,
the processor can reorder the load to happen after the while-loop.

In that case, a concurrent thread executing add_to_purgatory on the same
node can be scheduled to store NULL at the key field lf_alloc-pin.c#L253
before key is loaded in l_find.

b) A node is marked as deleted by a CAS in l_delete lf_hash.cc#L247 and
taken off the list with an upfollowing CAS lf_hash.cc#L252. Only if both
CAS succeed, the key field is written to by add_to_purgatory. However,
due to a missing barrier, the relaxed store of key lf_alloc-pin.c#L253
can be moved ahead of the two CAS operations, which makes the value of
the local purgatory list stored by add_to_purgatory visible to all threads
operating on the list. As the node is not marked as deleted yet, the
same error occurs in l_find.

This change three accesses to be atomic.

* optimistic read of key in l_find lf_hash.cc#L117
* read of link for verification lf_hash.cc#L124
* write of key in add_to_purgatory lf_alloc-pin.c#L253

Reviewers: Sergei Vojtovich, Sergei Golubchik

Fixes: MDEV-23510 / d30c1331a18d875e553f3fcf544997e4f33fb943
2021-11-30 15:16:16 +11:00
Daniel Black
e0ba68ba34 MDEV-23510: arm64 lf_hash alignment of pointers
Like the 10.2 version 1635686b50,
except C++ on internal functions for my_assume_aligned.

volatile != atomic.

volatile has no memory barrier schemantics, its for mmaped IO
so lets allow some optimizer gains and stop pretending it helps
with memory atomicity.

The MDEV lists a SEGV an assumption is made that an address was
partially read. As C packs structs strictly in order and on arm64 the
cache line size is 128 bits. A pointer (link - 64 bits), followed
by a hashnr (uint32 - 32 bits), leaves the following key (uchar *
64 bits), neither naturally aligned to any pointer and worse, split
across a cache line which is the processors view of an atomic
reservation of memory.

lf_dynarray_lvalue is assumed to return a 64 bit aligned address.

As a solution move the 32bit hashnr to the end so we don't get the
*key pointer split across two cache lines.

Tested by: Krunal Bauskar
Reviewer: Marko Mäkelä
2021-02-25 10:06:15 +11:00
Renamed from mysys/lf_hash.c (Browse further)