Commit graph

105 commits

Author SHA1 Message Date
Vladislav Vaintroub
a2a0adbfc3 MDEV-34533 post-fix
Cache results of expensive my_get_stack_bounds(), instead of calling
it for every query in thread_attach()

See MDEV-27943 for effort to reduce thread_attach() overhead
2024-11-05 21:32:48 +01:00
Oleksandr Byelkin
f00711bba2 Merge branch '10.5' into 10.6 2024-10-29 14:20:03 +01:00
Monty
bddbef3573 MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.

It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.

The following code shows the issue

Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
    (connect=0x5080000027b8,
    put_in_cache=<optimized out>) at sql/sql_connect.cc:1399

THD *thd;
1399      thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0

The address of thd is 24M away from the stack pointer

(gdb) info reg
...
rsp            0x7fffef4e7bc0      0x7fffef4e7bc0
...
r13            0x7fffedee7060      140737185214560

r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer

I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.

To solve this issue in a portable way, I have added two functions:

my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()

As a fallback for other compilers/arch we use the address of a local
variable.

my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.

Server changes are:

- Moving setting of thread_stack to THD::store_globals() using
  my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
  allocates a lot on the stack before calling store_globals().  When
  using estimates for stack start, we reduce stack_size with
  MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
  before calling store_globals().

I also added a unittest, stack_allocation-t, to verify the new code.

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-16 17:24:46 +03:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Oleksandr Byelkin
c9f54e20d4 MDEV-33990: SHOW STATUS counts ER_CON_COUNT_ERROR as Connection_errors_internal
Bring info about cause of closing connection in the place where we increment
statistics to do it correctly.
2024-09-23 16:16:51 +02:00
Oleksandr Byelkin
b83c379420 Merge branch '10.5' into 10.6 2023-11-08 15:57:05 +01:00
Oleksandr Byelkin
6cfd2ba397 Merge branch '10.4' into 10.5 2023-11-08 12:59:00 +01:00
Vladislav Vaintroub
87d1ab9ad9 MDEV-28561 Assertion failed: !pfs->m_idle || (state == PSI_SOCKET_STATE_ACTIVE)
The error was specific to threadpool/compressed protocol.
set_thd_idle() set socket state to idle twice, causing assert failure.

This happens if unread compressed data on connection,after query was
finished. On a protocol level, this means a single compression packet
contains multiple command packets.
2023-09-29 09:30:49 +02:00
Vladislav Vaintroub
a2a08eda8a MDEV-27943 - reduce calls to mysql_socket_set_thread_owner() in threadpool
It is enough to just once, during connection phase
2023-09-27 15:02:54 +02:00
Vladislav Vaintroub
15edd69ddf MDEV-27943 Reduce overhead of attaching THD to OS thread, in threadpool
Avoid relatively expensive THD::store_globals() for every query in the
threadpool. Use a lighter version instead, that only resets some thread
local storage variables(THD, mysys, PSI), avoids some calculationms
and caches syscall gettid (Linux only) in a thread_local variable.

Also simplify Worker_context use, with RAII.
2022-10-11 00:08:54 +02:00
Vladislav Vaintroub
7d63f21693 Merge branch '10.5' into 10.6
# Conflicts:
#	sql/sql_connect.cc
#	sql/threadpool_common.cc
2022-09-07 16:39:30 +02:00
Vladislav Vaintroub
80cf7a4c43 Merge branch '10.4' into 10.5
# Conflicts:
#	sql/sql_connect.cc
2022-09-07 15:28:58 +02:00
Vladislav Vaintroub
9a8faeea14 MDEV-18353 - minor cleanup
Do not repeat yourself.

Instead of having the same DBUG_EXECUTE_IF code in threadpool and
thread-per-connection, add this code to setup_connection_thread_globals()
which is executed in all scheduling modes.
2022-09-07 13:49:49 +02:00
Oleksandr Byelkin
6efb5e9f5e Merge branch '10.5' into 10.6 2021-08-02 10:11:41 +02:00
Oleksandr Byelkin
ae6bdc6769 Merge branch '10.4' into 10.5 2021-07-31 23:19:51 +02:00
Sergei Golubchik
4533e6ef65 MDEV-18353 Shutdown may miss to wait for connection thread
* count CONNECT objects too
* wait for all CONNECT objects to disappear (to be converted to THDs)
  before killing THDs away
2021-07-24 15:08:08 +02:00
Vladislav Vaintroub
4df0249b9a MDEV-24341 Innodb - do not block in foreground thread in log_write_up_to( 2021-02-14 18:30:39 +01:00
Marko Mäkelä
589cf8dbf3 Merge 10.3 into 10.4 2020-12-01 19:51:14 +02:00
Vladislav Vaintroub
08b0b70daa MDEV-24084 Fix race between disconnect and KILL CONNECTION
Prior to this patch, it is possible to access freed memory
(THD::event_scheduler) from tp_post_kill_notification().

With this patch, memory is freed only when THD is no more accessible
from other threads, i.e after it is removed from the thread_list.
2020-11-24 08:45:37 +01:00
Marko Mäkelä
09a1f0075a Merge 10.5 into 10.6 2020-11-02 12:49:19 +02:00
Marko Mäkelä
1657b7a583 Merge 10.4 to 10.5 2020-10-22 17:08:49 +03:00
Marko Mäkelä
46957a6a77 Merge 10.3 into 10.4 2020-10-22 13:27:18 +03:00
Marko Mäkelä
e3d692aa09 Merge 10.2 into 10.3 2020-10-22 08:26:28 +03:00
Vladislav Vaintroub
dab56d5e8e MDEV-23879 server hangs with threadpool, compression, and client pipelining
Amend check for unread client data in threadpool.

THD::NET will have unread data, in case client uses compression, and
wraps multiple commands into  a single compression packet

MariaDB C/C sends COM_STMT_RESET+COM_STMT_EXECUTE, and wraps it into
a single compressed packet, when compression is on, thus trying to use
compression and prepared statements against a threadpool-enabled server
will result into a hang, before this patch.
2020-10-03 00:24:53 +02:00
Marko Mäkelä
9a7948e3f6 Merge 10.5 into 10.6 2020-08-04 07:55:16 +03:00
Marko Mäkelä
50a11f396a Merge 10.4 into 10.5 2020-08-01 14:42:51 +03:00
Marko Mäkelä
9216114ce7 Merge 10.3 into 10.4 2020-07-31 18:09:08 +03:00
Vladislav Vaintroub
71015d844e MDEV-21101 unexpected wait_timeout with pool-of-threads
Due to restricted size of the threadpool, execution of client queries can
be delayed (queued) for a while. This delay was interpreted as client
inactivity, and connection is closed, if client idle time + queue time
exceeds wait_timeout.

But users did not expect queue time to be included into wait_timeout.

This patch changes the behavior. We don't close connection anymore,
if there is some unread data present on connection,
even if wait_timeout is exceeded. Unread data means that client
was not idle, it sent a query, which we did not have time to process yet.
2020-07-30 10:17:45 +02:00
Vladislav Vaintroub
272828a171 Merge branch '10.5' into 10.6 2020-07-04 11:53:26 +02:00
Vladislav Vaintroub
d15c839c0d MDEV-22990 Threadpool : Optimize network/named pipe IO for Windows
This patch reduces the overhead of system calls prior to a query, for
threadpool.  Previously, 3 system calls were done

1. WSARecv() to get notification of input data from client, asynchronous
equivalent of select() in one-thread-per-connection

2. recv(4 bytes) - reading packet header length
3. recv(packet payload)

Now there will be usually, just WSARecv(), which pre-reads user data into
a buffer, so we spared 2 syscalls

Profiler shows the most expensive call WSARecv(16%CPU) becomes 4% CPU,
after the patch, benchmark results (network heavy ones like point-select)
improve by ~20%

The buffer management was rather carefully done to keep
buffers together, as Windows would keeps the pages pinned
in memory for the duration of async calls.
At most 1MB memory is used for the buffers, and overhead per-connection is
only 256 bytes, which should cover most of the uses.

SSL does not yet use the optmization, so far it does not properly use
VIO for reads and writes. Neither one-thread-per-connection would get any
benefit, but that should be fine, it is not even default on Windows.
2020-06-26 14:44:36 +02:00
Vladislav Vaintroub
213265130e Remove some trailing whitespaces. 2020-05-29 13:05:35 +02:00
Vladislav Vaintroub
9aa6042a0d MDEV-22696 Threadpool : make sure thd->event_scheduler.data does not change as long as THD is in server_threads. 2020-05-25 14:54:11 +02:00
Monty
d1d472646d Change THD->transaction to a pointer to enable multiple transactions
All changes (except one) is of type
thd->transaction.  -> thd->transaction->

thd->transaction points by default to 'thd->default_transaction'
This allows us to 'easily' have multiple active transactions for a
THD object, like when reading data from the mysql.proc table
2020-05-23 12:29:10 +03:00
Eugene Kosov
89ff4176c1 MDEV-22437 make THR_THD* variable thread_local
Now all access goes through _current_thd() and set_current_thd()
functions.

Some functions like THD::store_globals() can not fail now.
2020-05-05 18:13:31 +03:00
Marko Mäkelä
496d0372ef Merge 10.4 into 10.5 2020-04-29 15:40:51 +03:00
Marko Mäkelä
b63446984c Merge 10.3 into 10.4 2020-04-27 17:38:17 +03:00
Marko Mäkelä
2e12d471ea Merge 10.2 into 10.3 2020-04-27 14:24:41 +03:00
Eugene Kosov
2c5067b689 cleanup THR_KEY_mysys
read TLS with my_thread_var
write TLS with set_mysys_var()

my_thread_var is no longer __attribute__ ((const)): this attribute
is simply incorrect here. Read gcc manual for more information.
sql/threadpool_generic.cc fails with that attribute.
2020-04-25 00:55:39 +03:00
Marko Mäkelä
37c14690fc Merge 10.4 into 10.5 2020-03-30 19:07:25 +03:00
Sergei Golubchik
7180afa094 fix perfschema for pool-of-threads 2020-03-10 19:24:24 +01:00
Marko Mäkelä
28c89b7151 Merge 10.4 into 10.5 2019-12-16 07:47:17 +02:00
Oleksandr Byelkin
a15234bf4b Merge branch '10.3' into 10.4 2019-12-09 15:09:41 +01:00
Faustin Lammler
2df2238cb8 Lintian complains on spelling error
The lintian check complains on spelling error:
https://salsa.debian.org/mariadb-team/mariadb-10.3/-/jobs/95739
2019-12-02 12:41:13 +02:00
Marko Mäkelä
780d2bb8a7 Merge 10.4 into 10.5 2019-09-06 14:25:20 +03:00
Teemu Ollakka
9487e0b259 MDEV-19826 10.4 seems to crash with "pool-of-threads" ()
MariaDB 10.4 was crashing when thread-handling was set to
pool-of-threads and wsrep was enabled.

There were two apparent reasons for the crash:
- Connection handling in threadpool_common.cc was missing calls to
  control wsrep client state.
- Thread specific storage which contains thread variables (THR_KEY_mysys)
  was not handled appropriately by wsrep patch when pool-of-threads
  was configured.

This patch addresses the above issues in the following way:
- Wsrep client state open/close was moved in thd_prepare_connection() and
  end_connection() to have common handling for one-thread-per-connection
  and pool-of-threads.
- Thread local storage handling in wsrep patch was reworked by introducing
  set of wsrep_xxx_threadvars() calls which replace calls to
  THD store_globals()/reset_globals() and deal with thread handling
  specifics internally.

Wsrep-lib was updated to version which relaxes internal concurrency
related sanity checks.

Rollback code from wsrep_rollback_process() was extracted to separate calls
for better readability.

Post rollback thread was removed as it was completely unused.
2019-08-30 08:42:24 +03:00
Marko Mäkelä
624dd71b94 Merge 10.4 into 10.5 2019-08-13 18:57:00 +03:00
Marko Mäkelä
7a3d34d645 Merge 10.3 into 10.4 2019-07-02 21:44:58 +03:00
Marko Mäkelä
e82fe21e3a Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
Eugene Kosov
ddeeb42e0b Merge 10.1 into 10.2 2019-06-23 20:33:13 +03:00
Alexey Botchkov
65e0c9b91b MDEV-18661 loading the audit plugin causes performance regression.
Plugin fixed to not lock the LOCK_operations when not active.
Server fixed to lock the LOCK_plugin less - do it once per
thread and then only if a plugin was installed/uninstalled.
2019-06-15 01:02:55 +04:00