Docker when mounting a configuration file into a Windows exposes the
file with permission 0777. These world writable files are ignored by
by MariaDB.
Add the access check such that filesystem RO or immutable file is
counted as sufficient protection on the file.
Test:
$ mkdir /tmp/src
$ vi /tmp/src/my.cnf
$ chmod 666 /tmp/src/my.cnf
$ mkdir /tmp/dst
$ sudo mount --bind /tmp/src /tmp/dst -o ro
$ ls -la /tmp/dst
total 4
drwxr-xr-x. 2 dan dan 60 Jun 15 15:12 .
drwxrwxrwt. 25 root root 660 Jun 15 15:13 ..
-rw-rw-rw-. 1 dan dan 10 Jun 15 15:12 my.cnf
$ mount | grep dst
tmpfs on /tmp/dst type tmpfs (ro,seclabel,nr_inodes=1048576,inode64)
strace client/mariadb --defaults-file=/tmp/dst/my.cnf
newfstatat(AT_FDCWD, "/tmp/dst/my.cnf", {st_mode=S_IFREG|0666, st_size=10, ...}, 0) = 0
access("/tmp/dst/my.cnf", W_OK) = -1 EROFS (Read-only file system)
openat(AT_FDCWD, "/tmp/dst/my.cnf", O_RDONLY|O_CLOEXEC) = 3
The one failing test, but this isn't a regression, just not a total fix:
$ chmod u-w /tmp/src/my.cnf
$ ls -la /tmp/src/my.cnf
-r--rw-rw-. 1 dan dan 18 Jun 16 10:22 /tmp/src/my.cnf
$ strace -fe trace=access client/mariadb --defaults-file=/tmp/dst/my.cnf
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
access("/etc/system-fips", F_OK) = -1 ENOENT (No such file or directory)
access("/tmp/dst/my.cnf", W_OK) = -1 EACCES (Permission denied)
Warning: World-writable config file '/tmp/dst/my.cnf' is ignored
Windows test (Docker Desktop ~4.21) which was the important one to fix:
dan@LAPTOP-5B5P7RCK:~$ docker run --rm -v /mnt/c/Users/danie/Desktop/conf:/etc/mysql/conf.d/:ro -e MARIADB_ROOT_PASSWORD=bob quay.io/m
ariadb-foundation/mariadb-devel:10.4-MDEV-27038-ro-mounts-pkgtest ls -la /etc/mysql/conf.d
total 4
drwxrwxrwx 1 root root 512 Jun 15 13:57 .
drwxr-xr-x 4 root root 4096 Jun 15 07:32 ..
-rwxrwxrwx 1 root root 43 Jun 15 13:56 myapp.cnf
root@a59b38b45af1:/# strace -fe trace=access mariadb
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
access("/etc/mysql/conf.d/myapp.cnf", W_OK) = -1 EROFS (Read-only file system)
rw_trx_hash_t::find() acquires element->mutex, then unpins pins, used for
lf_hash element search. After that the "element" can be deallocated and
reused by some other thread.
If we take a look rw_trx_hash_t::insert()->lf_hash_insert()->lf_alloc_new()
calls, we will not find any element->mutex acquisition, as it was not
initialized yet before it's allocation. rw_trx_hash_t::insert() can reuse
the chunk, unpinned in rw_trx_hash_t::find().
The scenario is the following:
1. Thread 1 have just executed lf_hash_search() in
rw_trx_hash_t::find(), but have not acquired element->mutex yet.
2. Thread 2 have removed the element from hash table with
rw_trx_hash_t::erase() call.
3. Thread 1 acquired element->mutex and unpinned pin 2 pin with
lf_hash_search_unpin(pins) call.
4. Some thread purged memory of the element.
5. Thread 3 reused the memory for the element, filled element->id,
element->trx.
6. Thread 1 crashes with failed "DBUG_ASSERT(trx_id == trx->id)"
assertion.
Note that trx_t objects are also reused, see the code around trx_pools
for details.
The fix is to invoke "lf_hash_search_unpin(pins);" after element->trx is
stored in local variable in rw_trx_hash_t::find().
Reviewed by: Nikita Malyavin, Marko Mäkelä.
Modern software (including text editors, static analysis software,
and web-based code review interfaces) often requires source code files
to be interpretable via a consistent character encoding, with UTF-8 or
ASCII (a strict subset of UTF-8) as the default. Several of the MariaDB
source files contain bytes that are not valid in either the UTF-8 or
ASCII encodings, but instead represent strings encoded in the
ISO-8859-1/Latin-1 or ISO-8859-2/Latin-2 encodings.
These inconsistent encodings may prevent software from correctly
presenting or processing such files. Converting all source files to
valid UTF8 characters will ensure correct handling.
Comments written in Czech were replaced with lightly-corrected
translations from Google Translate. Additionally, comments describing
the proper handling of special characters were changed so that the
comments are now purely UTF8.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.
Co-authored-by: Andrew Hutchings <andrew@linuxjedi.co.uk>
MariaDB server prints the stack information if a crash happens.
It traverses the stack frames in function `print_with_addr_resolve`.
For *EACH* frame, it tries to parse the file name and line number of the
frame using `addr2line`, or prints `backtrace_symbols_fd` if `addr2line`
fails.
1. Logic in `addr_resolve` function uses addr2line to get the file name
and line numbers. It has a timeout of 500ms to wait for the response
from addr2line. However, that's not enough on small instances
especially if the debug information is in a separate file or
compressed.
Increase the timeout to 5 seconds to support some edge cases, as
experiments showed addr2line may take 2-3 seconds on some frames.
2. While parsing a frame inside of a shared library using `addr2line`,
the file name and line numbers could be `??`, empty or `0` if the
debug info is not loaded.
It's easy to reproduce when glibc-debuginfo is not installed.
Instead of printing a meaningless frame like:
:0(__GI___poll)[0x1505e9197639]
...
??:0(__libc_start_main)[0x7ffff6c8913a]
We want to print the frame information using `backtrace_symbols_fd`,
with the shared library name and a hexadecimal offset.
Stacktrace example on a real instance with this commit:
/lib64/libc.so.6(__poll+0x49)[0x145cbf71a639]
...
/lib64/libc.so.6(__libc_start_main+0xea)[0x7f4d0034d13a]
`addr_resolve` has considered the case of meaningless combination of
file name and line number returned by `addr2line`. e.g. `??:?`
However, conditions like `:0` and `??:0` are not handled. So now the
function will rollback to `backtrace_symbols_fd` in above cases.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
Replace calls to `sprintf` and `strcpy` by the safer options `snprintf`
and `safe_strcpy` in the following directories:
- libmysqld
- mysys
- sql-common
- strings
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.
When trying to output stacktrace, and addr2line is not installed, the
child process forked by start_addr2line_fork() will fail to do exec(),
and finish with exit(1).
There is a problem with exit() though - it runs exit handlers,
and for the forked copy of crashing process, it is a bad idea.
In 10.5+ code for example, exit handlers include
tpool::task_group static destructors, and it will hang infinitely
waiting for completion of the outstanding tasks.
The fix is to use _exit() instead, which skips the execution of exit
handlers
on Linux this pthread_attr_setstacksize() fails with EINVAL
"The stack size is less than PTHREAD_STACK_MIN (16384) bytes".
But on FreeBSD it succeeds and causes a crash later, as 8196 is too little.
Let's keep the stack at its default size in the timer thread.
In commit 28325b0863
a compile-time option was introduced to disable the macros
DBUG_ENTER and DBUG_RETURN or DBUG_VOID_RETURN.
The parameter name WITH_DBUG_TRACE would hint that it also
covers DBUG_PRINT statements. Let us do that: WITH_DBUG_TRACE=OFF
shall disable DBUG_PRINT() as well.
A few InnoDB recovery tests used to check that some output from
DBUG_PRINT("ib_log", ...) is present. We can live without those checks.
Reviewed by: Vladislav Vaintroub
Take into account that in preparation of a simple key cache for resizing no disk blocks might be assigned to it.
Reviewer: IgorBabaev <igor@mariadb.com>
On affected machine, the error happens sporadically in
innodb.instant_alter_limit.
Procmon shows SetRenameInformationFile failing with ERROR_ACCESS_DENIED.
In this case, the destination file was previously opened rsp oplocked by
Windows defender antivirus.
The fix is to retry MoveFileEx on ERROR_ACCESS_DENIED.
Small postfix to MDEV-23175 to ensure faster option on FreeBSD
and compatibility to Solaris that isn't high resolution.
ftime is left as a backup in case an implementation doesn't
contain any of these clocks.
FreeBSD
$ ./unittest/mysys/my_rdtsc-t
1..11
# ----- Routine ---------------
# myt.cycles.routine : 5
# myt.nanoseconds.routine : 11
# myt.microseconds.routine : 13
# myt.milliseconds.routine : 11
# myt.ticks.routine : 17
# ----- Frequency -------------
# myt.cycles.frequency : 3610295566
# myt.nanoseconds.frequency : 1000000000
# myt.microseconds.frequency : 1000000
# myt.milliseconds.frequency : 899
# myt.ticks.frequency : 136
# ----- Resolution ------------
# myt.cycles.resolution : 1
# myt.nanoseconds.resolution : 1
# myt.microseconds.resolution : 1
# myt.milliseconds.resolution : 7
# myt.ticks.resolution : 1
# ----- Overhead --------------
# myt.cycles.overhead : 26
# myt.nanoseconds.overhead : 19140
# myt.microseconds.overhead : 19036
# myt.milliseconds.overhead : 578
# myt.ticks.overhead : 21544
ok 1 - my_timer_init() did not crash
ok 2 - The cycle timer is strictly increasing
ok 3 - The cycle timer is implemented
ok 4 - The nanosecond timer is increasing
ok 5 - The nanosecond timer is implemented
ok 6 - The microsecond timer is increasing
ok 7 - The microsecond timer is implemented
ok 8 - The millisecond timer is increasing
ok 9 - The millisecond timer is implemented
ok 10 - The tick timer is increasing
ok 11 - The tick timer is implemented
Small postfix to MDEV-23175 to ensure faster option on FreeBSD
and compatibility to Solaris that isn't high resolution.
ftime is left as a backup in case an implementation doesn't
contain any of these clocks.
FreeBSD
$ ./unittest/mysys/my_rdtsc-t
1..11
# ----- Routine ---------------
# myt.cycles.routine : 5
# myt.nanoseconds.routine : 11
# myt.microseconds.routine : 13
# myt.milliseconds.routine : 11
# myt.ticks.routine : 17
# ----- Frequency -------------
# myt.cycles.frequency : 3610295566
# myt.nanoseconds.frequency : 1000000000
# myt.microseconds.frequency : 1000000
# myt.milliseconds.frequency : 899
# myt.ticks.frequency : 136
# ----- Resolution ------------
# myt.cycles.resolution : 1
# myt.nanoseconds.resolution : 1
# myt.microseconds.resolution : 1
# myt.milliseconds.resolution : 7
# myt.ticks.resolution : 1
# ----- Overhead --------------
# myt.cycles.overhead : 26
# myt.nanoseconds.overhead : 19140
# myt.microseconds.overhead : 19036
# myt.milliseconds.overhead : 578
# myt.ticks.overhead : 21544
ok 1 - my_timer_init() did not crash
ok 2 - The cycle timer is strictly increasing
ok 3 - The cycle timer is implemented
ok 4 - The nanosecond timer is increasing
ok 5 - The nanosecond timer is implemented
ok 6 - The microsecond timer is increasing
ok 7 - The microsecond timer is implemented
ok 8 - The millisecond timer is increasing
ok 9 - The millisecond timer is implemented
ok 10 - The tick timer is increasing
ok 11 - The tick timer is implemented
MariaDB server crashes on ARM (weak memory model architecture) while
concurrently executing l_find to load node->key and add_to_purgatory
to store node->key = NULL. l_find then uses key (which is NULL), to
pass it to a comparison function.
The specific problem is the out-of-order execution that happens on a
weak memory model architecture. Two essential reorderings are possible,
which need to be prevented.
a) As l_find has no barriers in place between the optimistic read of
the key field lf_hash.cc#L117 and the verification of link lf_hash.cc#L124,
the processor can reorder the load to happen after the while-loop.
In that case, a concurrent thread executing add_to_purgatory on the same
node can be scheduled to store NULL at the key field lf_alloc-pin.c#L253
before key is loaded in l_find.
b) A node is marked as deleted by a CAS in l_delete lf_hash.cc#L247 and
taken off the list with an upfollowing CAS lf_hash.cc#L252. Only if both
CAS succeed, the key field is written to by add_to_purgatory. However,
due to a missing barrier, the relaxed store of key lf_alloc-pin.c#L253
can be moved ahead of the two CAS operations, which makes the value of
the local purgatory list stored by add_to_purgatory visible to all threads
operating on the list. As the node is not marked as deleted yet, the
same error occurs in l_find.
This change three accesses to be atomic.
* optimistic read of key in l_find lf_hash.cc#L117
* read of link for verification lf_hash.cc#L124
* write of key in add_to_purgatory lf_alloc-pin.c#L253
Reviewers: Sergei Vojtovich, Sergei Golubchik
Fixes: MDEV-23510 / d30c1331a18d875e553f3fcf544997e4f33fb943
Some architectures (mips) require libatomic to support proper
atomic operations. Check first if support is available without
linking, otherwise use the library.
Contributors:
James Cowgill <jcowgill@debian.org>
Jessica Clarke <jrtc27@debian.org>
Vicențiu Ciorbaru <vicentiu@mariadb.org>
Create minidump when server fails to shutdown. If process is being
debugged, cause a debug break.
Moves some code which is part of safe_kill into mysys, as both safe_kill,
and mysqltest produce minidumps on different timeouts.
Small cleanup in wait_until_dead() - replace inefficient loop with a single
wait.
Thanks to Fabian Vogt for noticing the mutual exclusions
of these open flags on tmpfs caused by mariadb opening it
incorrectly.
As such we clear the O_CREAT flag while opening it as O_TMPFILE.
init_mutex_v1_t: Stop lying that the mutex parameter is const.
GCC 11.2.0 assumes that it is and could complain about any mysql_mutex_t
being uninitialized even after mysql_mutex_init() as long as
PLUGIN_PERFSCHEMA is enabled.
init_rwlock_v1_t, init_cond_v1_t: Remove untruthful const qualifiers.
Note: init_socket_v1_t is expecting that the socket fd has already
been created before PSI_SOCKET_CALL(init_socket), and therefore that
parameter really is being treated as a pointer to const.