When calculating next wakeup timepoint for the timer thread, with large
thread_pool_stall_limit values, 32bit int overflow can happen.
Fixed by making one operand 8 byte large.
Also fixed the type of tick_interval to be unsigned, so it does not
go negative for very thread_pool_stall_limit.
This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .
Code style changes have been done on top. The result of this change
leads to the following improvements:
1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.
2. Compiler can better understand the intent of the code, thus it leads
to more optimization possibilities. Additionally it enabled detecting
unused variables that had an empty default constructor but not marked
so explicitly.
Particular change required following this patch in sql/opt_range.cc
result_keys, an unused template class Bitmap now correctly issues
unused variable warnings.
Setting Bitmap template class constructor to default allows the compiler
to identify that there are no side-effects when instantiating the class.
Previously the compiler could not issue the warning as it assumed Bitmap
class (being a template) would not be performing a NO-OP for its default
constructor. This prevented the "unused variable warning".
If --thread-pool-dedicated-listener is set, worker should not pick up
events. Dedicated listener constantly drains all events, thus polling
also from another thread makes no sense.
The performance_schema data related to instrument "wait/synch/mutex/threadpool/group_mutex" was incorrect. The root cause is the group_mutex was initialized in thread_group_init before registerd in PSI_register(mutex).
The fix is to put PSI_register before thread_group_init, which ensures the right order.
In NetBSD 9.x and prior, udata is an intptr_t, but in 10.x (current
development branch) it was changed to be a void * for compatibility
with other BSDs a year or so ago.
Unfortunately, this does not simplify the code, as NetBSD 8.x and 9.x
are still supported and will be for a few more years.
Signed-off-by: Nia Alarie <nia@NetBSD.org>
Due to restricted size of the threadpool, execution of client queries can
be delayed (queued) for a while. This delay was interpreted as client
inactivity, and connection is closed, if client idle time + queue time
exceeds wait_timeout.
But users did not expect queue time to be included into wait_timeout.
This patch changes the behavior. We don't close connection anymore,
if there is some unread data present on connection,
even if wait_timeout is exceeded. Unread data means that client
was not idle, it sent a query, which we did not have time to process yet.
This patch reduces the overhead of system calls prior to a query, for
threadpool. Previously, 3 system calls were done
1. WSARecv() to get notification of input data from client, asynchronous
equivalent of select() in one-thread-per-connection
2. recv(4 bytes) - reading packet header length
3. recv(packet payload)
Now there will be usually, just WSARecv(), which pre-reads user data into
a buffer, so we spared 2 syscalls
Profiler shows the most expensive call WSARecv(16%CPU) becomes 4% CPU,
after the patch, benchmark results (network heavy ones like point-select)
improve by ~20%
The buffer management was rather carefully done to keep
buffers together, as Windows would keeps the pages pinned
in memory for the duration of async calls.
At most 1MB memory is used for the buffers, and overhead per-connection is
only 256 bytes, which should cover most of the uses.
SSL does not yet use the optmization, so far it does not properly use
VIO for reads and writes. Neither one-thread-per-connection would get any
benefit, but that should be fine, it is not even default on Windows.
Fixed the condition for waking up/creating another thread.
If there is some work to do (if the request queue is not empty),
a thread should be woken or created.
The condition was incorrect since 18c9b345b4
stalls etc better.
- thread_pool_exact_stats - uses high precision timestamp for
the time when connection was added to the queue. This timestamp helps
calculating queuing time shown in I_S.THREADPOOL_QUEUES entries.
- If thread_pool_dedicated_listener is on, then each group will have its
own dedicated listener, that does not convert to worker.
With this variable on, the queueing time in I_S.THREADPOOL_QUEUES , and
actual queue size in I_S.THREADPOOOL_GROUPS will be more exact, since
IO request are immediately dequeued from poll, without delay.
Part of MDEV-19313.