MariaDB server prints the stack information if a crash happens.
It traverses the stack frames in function `print_with_addr_resolve`.
For *EACH* frame, it tries to parse the file name and line number of the
frame using `addr2line`, or prints `backtrace_symbols_fd` if `addr2line`
fails.
1. Logic in `addr_resolve` function uses addr2line to get the file name
and line numbers. It has a timeout of 500ms to wait for the response
from addr2line. However, that's not enough on small instances
especially if the debug information is in a separate file or
compressed.
Increase the timeout to 5 seconds to support some edge cases, as
experiments showed addr2line may take 2-3 seconds on some frames.
2. While parsing a frame inside of a shared library using `addr2line`,
the file name and line numbers could be `??`, empty or `0` if the
debug info is not loaded.
It's easy to reproduce when glibc-debuginfo is not installed.
Instead of printing a meaningless frame like:
:0(__GI___poll)[0x1505e9197639]
...
??:0(__libc_start_main)[0x7ffff6c8913a]
We want to print the frame information using `backtrace_symbols_fd`,
with the shared library name and a hexadecimal offset.
Stacktrace example on a real instance with this commit:
/lib64/libc.so.6(__poll+0x49)[0x145cbf71a639]
...
/lib64/libc.so.6(__libc_start_main+0xea)[0x7f4d0034d13a]
`addr_resolve` has considered the case of meaningless combination of
file name and line number returned by `addr2line`. e.g. `??:?`
However, conditions like `:0` and `??:0` are not handled. So now the
function will rollback to `backtrace_symbols_fd` in above cases.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
Replace calls to `sprintf` and `strcpy` by the safer options `snprintf`
and `safe_strcpy` in the following directories:
- libmysqld
- mysys
- sql-common
- strings
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.
When trying to output stacktrace, and addr2line is not installed, the
child process forked by start_addr2line_fork() will fail to do exec(),
and finish with exit(1).
There is a problem with exit() though - it runs exit handlers,
and for the forked copy of crashing process, it is a bad idea.
In 10.5+ code for example, exit handlers include
tpool::task_group static destructors, and it will hang infinitely
waiting for completion of the outstanding tasks.
The fix is to use _exit() instead, which skips the execution of exit
handlers
on Linux this pthread_attr_setstacksize() fails with EINVAL
"The stack size is less than PTHREAD_STACK_MIN (16384) bytes".
But on FreeBSD it succeeds and causes a crash later, as 8196 is too little.
Let's keep the stack at its default size in the timer thread.
In commit 28325b0863
a compile-time option was introduced to disable the macros
DBUG_ENTER and DBUG_RETURN or DBUG_VOID_RETURN.
The parameter name WITH_DBUG_TRACE would hint that it also
covers DBUG_PRINT statements. Let us do that: WITH_DBUG_TRACE=OFF
shall disable DBUG_PRINT() as well.
A few InnoDB recovery tests used to check that some output from
DBUG_PRINT("ib_log", ...) is present. We can live without those checks.
Reviewed by: Vladislav Vaintroub
Take into account that in preparation of a simple key cache for resizing no disk blocks might be assigned to it.
Reviewer: IgorBabaev <igor@mariadb.com>
On affected machine, the error happens sporadically in
innodb.instant_alter_limit.
Procmon shows SetRenameInformationFile failing with ERROR_ACCESS_DENIED.
In this case, the destination file was previously opened rsp oplocked by
Windows defender antivirus.
The fix is to retry MoveFileEx on ERROR_ACCESS_DENIED.
Small postfix to MDEV-23175 to ensure faster option on FreeBSD
and compatibility to Solaris that isn't high resolution.
ftime is left as a backup in case an implementation doesn't
contain any of these clocks.
FreeBSD
$ ./unittest/mysys/my_rdtsc-t
1..11
# ----- Routine ---------------
# myt.cycles.routine : 5
# myt.nanoseconds.routine : 11
# myt.microseconds.routine : 13
# myt.milliseconds.routine : 11
# myt.ticks.routine : 17
# ----- Frequency -------------
# myt.cycles.frequency : 3610295566
# myt.nanoseconds.frequency : 1000000000
# myt.microseconds.frequency : 1000000
# myt.milliseconds.frequency : 899
# myt.ticks.frequency : 136
# ----- Resolution ------------
# myt.cycles.resolution : 1
# myt.nanoseconds.resolution : 1
# myt.microseconds.resolution : 1
# myt.milliseconds.resolution : 7
# myt.ticks.resolution : 1
# ----- Overhead --------------
# myt.cycles.overhead : 26
# myt.nanoseconds.overhead : 19140
# myt.microseconds.overhead : 19036
# myt.milliseconds.overhead : 578
# myt.ticks.overhead : 21544
ok 1 - my_timer_init() did not crash
ok 2 - The cycle timer is strictly increasing
ok 3 - The cycle timer is implemented
ok 4 - The nanosecond timer is increasing
ok 5 - The nanosecond timer is implemented
ok 6 - The microsecond timer is increasing
ok 7 - The microsecond timer is implemented
ok 8 - The millisecond timer is increasing
ok 9 - The millisecond timer is implemented
ok 10 - The tick timer is increasing
ok 11 - The tick timer is implemented
Small postfix to MDEV-23175 to ensure faster option on FreeBSD
and compatibility to Solaris that isn't high resolution.
ftime is left as a backup in case an implementation doesn't
contain any of these clocks.
FreeBSD
$ ./unittest/mysys/my_rdtsc-t
1..11
# ----- Routine ---------------
# myt.cycles.routine : 5
# myt.nanoseconds.routine : 11
# myt.microseconds.routine : 13
# myt.milliseconds.routine : 11
# myt.ticks.routine : 17
# ----- Frequency -------------
# myt.cycles.frequency : 3610295566
# myt.nanoseconds.frequency : 1000000000
# myt.microseconds.frequency : 1000000
# myt.milliseconds.frequency : 899
# myt.ticks.frequency : 136
# ----- Resolution ------------
# myt.cycles.resolution : 1
# myt.nanoseconds.resolution : 1
# myt.microseconds.resolution : 1
# myt.milliseconds.resolution : 7
# myt.ticks.resolution : 1
# ----- Overhead --------------
# myt.cycles.overhead : 26
# myt.nanoseconds.overhead : 19140
# myt.microseconds.overhead : 19036
# myt.milliseconds.overhead : 578
# myt.ticks.overhead : 21544
ok 1 - my_timer_init() did not crash
ok 2 - The cycle timer is strictly increasing
ok 3 - The cycle timer is implemented
ok 4 - The nanosecond timer is increasing
ok 5 - The nanosecond timer is implemented
ok 6 - The microsecond timer is increasing
ok 7 - The microsecond timer is implemented
ok 8 - The millisecond timer is increasing
ok 9 - The millisecond timer is implemented
ok 10 - The tick timer is increasing
ok 11 - The tick timer is implemented
MariaDB server crashes on ARM (weak memory model architecture) while
concurrently executing l_find to load node->key and add_to_purgatory
to store node->key = NULL. l_find then uses key (which is NULL), to
pass it to a comparison function.
The specific problem is the out-of-order execution that happens on a
weak memory model architecture. Two essential reorderings are possible,
which need to be prevented.
a) As l_find has no barriers in place between the optimistic read of
the key field lf_hash.cc#L117 and the verification of link lf_hash.cc#L124,
the processor can reorder the load to happen after the while-loop.
In that case, a concurrent thread executing add_to_purgatory on the same
node can be scheduled to store NULL at the key field lf_alloc-pin.c#L253
before key is loaded in l_find.
b) A node is marked as deleted by a CAS in l_delete lf_hash.cc#L247 and
taken off the list with an upfollowing CAS lf_hash.cc#L252. Only if both
CAS succeed, the key field is written to by add_to_purgatory. However,
due to a missing barrier, the relaxed store of key lf_alloc-pin.c#L253
can be moved ahead of the two CAS operations, which makes the value of
the local purgatory list stored by add_to_purgatory visible to all threads
operating on the list. As the node is not marked as deleted yet, the
same error occurs in l_find.
This change three accesses to be atomic.
* optimistic read of key in l_find lf_hash.cc#L117
* read of link for verification lf_hash.cc#L124
* write of key in add_to_purgatory lf_alloc-pin.c#L253
Reviewers: Sergei Vojtovich, Sergei Golubchik
Fixes: MDEV-23510 / d30c1331a18d875e553f3fcf544997e4f33fb943
Some architectures (mips) require libatomic to support proper
atomic operations. Check first if support is available without
linking, otherwise use the library.
Contributors:
James Cowgill <jcowgill@debian.org>
Jessica Clarke <jrtc27@debian.org>
Vicențiu Ciorbaru <vicentiu@mariadb.org>
Create minidump when server fails to shutdown. If process is being
debugged, cause a debug break.
Moves some code which is part of safe_kill into mysys, as both safe_kill,
and mysqltest produce minidumps on different timeouts.
Small cleanup in wait_until_dead() - replace inefficient loop with a single
wait.
Thanks to Fabian Vogt for noticing the mutual exclusions
of these open flags on tmpfs caused by mariadb opening it
incorrectly.
As such we clear the O_CREAT flag while opening it as O_TMPFILE.
init_mutex_v1_t: Stop lying that the mutex parameter is const.
GCC 11.2.0 assumes that it is and could complain about any mysql_mutex_t
being uninitialized even after mysql_mutex_init() as long as
PLUGIN_PERFSCHEMA is enabled.
init_rwlock_v1_t, init_cond_v1_t: Remove untruthful const qualifiers.
Note: init_socket_v1_t is expecting that the socket fd has already
been created before PSI_SOCKET_CALL(init_socket), and therefore that
parameter really is being treated as a pointer to const.
Removed redundant code for BF abort transaction in `thr_lock.cc`.
TOI operations will ignore provided lock_wait_timeout and use `LONG_TIMEOUT`
until operation is finished.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.
The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
memory access of integers. Fixed by using byte_order_generic.h when
compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
safe to have overflows (two cases, in item_func.cc).
Things fixed:
- Don't left shift signed values
(byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
constructors. This was needed as UBSAN checks that these types has
correct values when one copies an object.
(gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
deleted objects.
(events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
on Query_arena object.
- Fixed several cast of objects to an incompatible class!
(Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
This includes also ++ and -- of integers.
(Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
value_type is initialized to this instead of to -1, which is not a valid
enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
instead of a null string (safer as it ensures we do not do arithmetic
on null strings).
Other things:
- Changed struct st_position to an OBJECT and added an initialization
function to it to ensure that we do not copy or use uninitialized
members. The change to a class was also motived that we used "struct
st_position" and POSITION randomly trough the code which was
confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr. (This variable was before
only in 10.5 and up). It can now have one of two values:
ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
it virtual. This was an effort to get UBSAN to work with loaded storage
engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
to integer arithmetic.
Changes that should not be needed but had to be done to suppress warnings
from UBSAN:
- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
some compile time warnings.
Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia