Cache results of expensive my_get_stack_bounds(), instead of calling
it for every query in thread_attach()
See MDEV-27943 for effort to reduce thread_attach() overhead
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.
It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.
The following code shows the issue
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()
As a fallback for other compilers/arch we use the address of a local
variable.
my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.
Server changes are:
- Moving setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
allocates a lot on the stack before calling store_globals(). When
using estimates for stack start, we reduce stack_size with
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
The error was specific to threadpool/compressed protocol.
set_thd_idle() set socket state to idle twice, causing assert failure.
This happens if unread compressed data on connection,after query was
finished. On a protocol level, this means a single compression packet
contains multiple command packets.
Avoid relatively expensive THD::store_globals() for every query in the
threadpool. Use a lighter version instead, that only resets some thread
local storage variables(THD, mysys, PSI), avoids some calculationms
and caches syscall gettid (Linux only) in a thread_local variable.
Also simplify Worker_context use, with RAII.
Do not repeat yourself.
Instead of having the same DBUG_EXECUTE_IF code in threadpool and
thread-per-connection, add this code to setup_connection_thread_globals()
which is executed in all scheduling modes.
Prior to this patch, it is possible to access freed memory
(THD::event_scheduler) from tp_post_kill_notification().
With this patch, memory is freed only when THD is no more accessible
from other threads, i.e after it is removed from the thread_list.
Amend check for unread client data in threadpool.
THD::NET will have unread data, in case client uses compression, and
wraps multiple commands into a single compression packet
MariaDB C/C sends COM_STMT_RESET+COM_STMT_EXECUTE, and wraps it into
a single compressed packet, when compression is on, thus trying to use
compression and prepared statements against a threadpool-enabled server
will result into a hang, before this patch.
Due to restricted size of the threadpool, execution of client queries can
be delayed (queued) for a while. This delay was interpreted as client
inactivity, and connection is closed, if client idle time + queue time
exceeds wait_timeout.
But users did not expect queue time to be included into wait_timeout.
This patch changes the behavior. We don't close connection anymore,
if there is some unread data present on connection,
even if wait_timeout is exceeded. Unread data means that client
was not idle, it sent a query, which we did not have time to process yet.
This patch reduces the overhead of system calls prior to a query, for
threadpool. Previously, 3 system calls were done
1. WSARecv() to get notification of input data from client, asynchronous
equivalent of select() in one-thread-per-connection
2. recv(4 bytes) - reading packet header length
3. recv(packet payload)
Now there will be usually, just WSARecv(), which pre-reads user data into
a buffer, so we spared 2 syscalls
Profiler shows the most expensive call WSARecv(16%CPU) becomes 4% CPU,
after the patch, benchmark results (network heavy ones like point-select)
improve by ~20%
The buffer management was rather carefully done to keep
buffers together, as Windows would keeps the pages pinned
in memory for the duration of async calls.
At most 1MB memory is used for the buffers, and overhead per-connection is
only 256 bytes, which should cover most of the uses.
SSL does not yet use the optmization, so far it does not properly use
VIO for reads and writes. Neither one-thread-per-connection would get any
benefit, but that should be fine, it is not even default on Windows.
All changes (except one) is of type
thd->transaction. -> thd->transaction->
thd->transaction points by default to 'thd->default_transaction'
This allows us to 'easily' have multiple active transactions for a
THD object, like when reading data from the mysql.proc table
read TLS with my_thread_var
write TLS with set_mysys_var()
my_thread_var is no longer __attribute__ ((const)): this attribute
is simply incorrect here. Read gcc manual for more information.
sql/threadpool_generic.cc fails with that attribute.
MariaDB 10.4 was crashing when thread-handling was set to
pool-of-threads and wsrep was enabled.
There were two apparent reasons for the crash:
- Connection handling in threadpool_common.cc was missing calls to
control wsrep client state.
- Thread specific storage which contains thread variables (THR_KEY_mysys)
was not handled appropriately by wsrep patch when pool-of-threads
was configured.
This patch addresses the above issues in the following way:
- Wsrep client state open/close was moved in thd_prepare_connection() and
end_connection() to have common handling for one-thread-per-connection
and pool-of-threads.
- Thread local storage handling in wsrep patch was reworked by introducing
set of wsrep_xxx_threadvars() calls which replace calls to
THD store_globals()/reset_globals() and deal with thread handling
specifics internally.
Wsrep-lib was updated to version which relaxes internal concurrency
related sanity checks.
Rollback code from wsrep_rollback_process() was extracted to separate calls
for better readability.
Post rollback thread was removed as it was completely unused.
Plugin fixed to not lock the LOCK_operations when not active.
Server fixed to lock the LOCK_plugin less - do it once per
thread and then only if a plugin was installed/uninstalled.