The problem was that the introduction of max-thread-mem-used can cause
an allocation error very early, even before mysql_parse() is called.
As mysql_parse() calls thd->reset_for_next_command(), which called
clear_error(), the error number was lost.
Fixed by adding an option to have unique messages for each KILL
signal and change max-thread-mem-used to use this new feature.
This removes a lot of problems with the original approach, where
one could get errors signaled silenty almost any time.
ixed by moving clear_error() from reset_for_next_command() to
do_command(), before any memory allocation for the thread.
Related changes:
- reset_for_next_command() now have an optional parameter if we should
call clear_error() or not. By default it's called, but not anymore from
dispatch_command() which was the original problem.
- Added optional paramater to clear_error() to force calling of
reset_diagnostics_area(). Before clear_error() only called
reset_diagnostics_area() if there was no error, so we normally
called reset_diagnostics_area() twice.
- This change removed several duplicated calls to clear_error()
when starting a query.
- Reset max_mem_used on COM_QUIT, to protect against kill during
quit.
- Use fatal_error() instead of setting is_fatal_error (cleanup)
- Set fatal_error if max_thead_mem_used is signaled.
(Same logic we use for other places where we are out of resources)
bytes are possibly lost
Timer thread of threadpool is created "joinable", but they're not "joined" on
completion. This causes memory leaks around thread local storage.
Fixed by joining timer thread.
An addition to fix for MDEV-5205, fixes server crash on shutdown.
Thread groups are destroyed asynchronously, that is kill server
thread sends shutdown request to all thread groups without waiting
for compeltion.
It means all_groups array must not be freed until all thread groups
are down. This patch suggests that all_groups is freed when last
thread group is destroyed.
Note 1: threadpool code doesn't surround atomic ops with atomic locks,
thus no locks for shutdown_group_count.
Note 2: this patch preserves old behaviour, but we may need to wait
until all thread groups are down before returning from tp_end().
- thread_pool_size command line option upper limit increased to 100 000
(same as for max_connections)
- thread_pool_size system variable upper limit is maximum of 128 or
the value given at command line
- thread groups are now allocated dynamically
Different limit for command line option and system variable was done to
avoid additional mutex for all_groups and threadpool_max_size.
Assertion happens in replication thread during THD destruction, when thread calls my_sync(), which in turn calls thd_wait_begin() callback. Connection count can be 0, because the counter was decremented before THD destructor.
This assertion currently reproducible only in Percona server and not in MariaDB, due to differences in replication code.
Fixed by moving code to decrement connection counter after the THD destructor.
Use post_kill_notification in for one_thread_per_connection scheduler,
the same as already used in threadpool, to reliably wake a thread stuck in
read() or in different poll() variations.
To allow it, change minimum of thread_pool_stall_limit to be 10 milliseconds.
Also introduce a new parameter to oversubscribe a group . Number of threads running in parallel would be higher than it normally should, leading to thrashing, but it may improving preemptiveness, which is useful for the described corner case.