detector". This patch addresses performance regression in OLTP_RO/MyISAM
test on Windows introduced by the fix for bug #56405. Thus it makes
original patch acceptable as a solution for bug #56585 "Slowdown of
readonly sysbench benchmarks (e.g point_select) on Windows 5.5".
With this patch, MySQL will use native Windows condition variables and
reader-writer locks if they are supported by the OS.
This speeds up MyISAM and the effect comes mostly from using native
rwlocks. Native conditions improve scalability with higher number of
concurrent users in other situations, e.g for prlocks.
Benchmark numbers for this patch as measured on Win2008R2 quad
core machine are attached to the bug report.
( direct link http://bugs.mysql.com/file.php?id=15883 )
Note that currently we require at least Windows7/WS2008R2 for
reader-writer locks, even though native rwlock is available also on Vista.
Reason is that "trylock" APIs are missing on Vista, and trylock is used in
the server (in a single place in query cache).
While this patch could have been written differently, to enable the native
rwlock optimization also on Vista/WS2008 (e.g using native locks everywhere
but portable implementation in query cache), this would come at the
expense of the code clarity, as it would introduce a new "try-able" rwlock
type, to handle Vista case.
Another way to improve performance for the special case
(OLTP_RO/MYISAM/Vista) would be to eliminate "trylock" usage from server,
but this is outside of the scope here.
Native conditions variables are used beginning with Vista though the effect
of using condition variables alone is not measurable in this benchmark.
But when used together with native rwlocks on Win7, native conditions improve
performance in high-concurrency OLTP_RO/MyISAM (128 and more sysbench
users).
detector" that doesn't introduce bug #56715 "Concurrent
transactions + FLUSH result in sporadical unwarranted
deadlock errors".
Deadlock could have occurred when workload containing a mix
of DML, DDL and FLUSH TABLES statements affecting the same
set of tables was executed in a heavily concurrent environment.
This deadlock occurred when several connections tried to
perform deadlock detection in the metadata locking subsystem.
The first connection started traversing wait-for graph,
encountered a sub-graph representing a wait for flush, acquired
LOCK_open and dived into sub-graph inspection. Then it
encountered sub-graph corresponding to wait for metadata lock
and blocked while trying to acquire a rd-lock on
MDL_lock::m_rwlock, since some,other thread had a wr-lock on it.
When this wr-lock was released it could have happened (if there
was another pending wr-lock against this rwlock) that the rd-lock
from the first connection was left unsatisfied but at the same
time the new rd-lock request from the second connection sneaked
in and was satisfied (for this to be possible the second
rd-request should come exactly after the wr-lock is released but
before pending the wr-lock manages to grab rwlock, which is
possible both on Linux and in our own rwlock implementation).
If this second connection continued traversing the wait-for graph
and encountered a sub-graph representing a wait for flush it tried
to acquire LOCK_open and thus the deadlock was created.
The previous patch tried to workaround this problem by not
allowing the deadlock detector to lock LOCK_open mutex if
some other thread doing deadlock detection already owns it
and current search depth is greater than 0. Instead deadlock
was reported. As a result it has introduced bug #56715.
This patch solves this problem in a different way.
It introduces a new rw_pr_lock_t implementation to be used
by MDL subsystem instead of one based on Linux rwlocks or
our own rwlock implementation. This new implementation
never allows situation in which an rwlock is rd-locked and
there is a blocked pending rd-lock. Thus the situation which
has caused this bug becomes impossible with this implementation.
Due to fact that this implementation is optimized for
wr-lock/unlock scenario which is most common in the MDL
subsystem it doesn't introduce noticeable performance
regressions in sysbench tests. Moreover it significantly
improves situation for POINT_SELECT test when many
connections are used.
No test case is provided as this bug is very hard to repeat
in MTR environment but is repeatable with the help of RQG
tests.
This patch also doesn't include a test for bug #56715
"Concurrent transactions + FLUSH result in sporadical
unwarranted deadlock errors" as it takes too much time to
be run as part of normal test-suite runs.
on Windows".
On platforms where read-write lock implementation does not
prefer readers by default (Windows, Solaris) server might
have deadlocked while detecting MDL deadlock.
MDL deadlock detector relies on the fact that read-write
locks which are used in its implementation prefer readers
(see new comment for MDL_lock::m_rwlock for details).
So far MDL code assumed that default implementation of
read/write locks for the system has this property.
Indeed, this turned out ot be wrong, for example, for
Windows or Solaris. Thus MDL deadlock detector might have
deadlocked on these systems.
This fix simply adds portable implementation of read/write
lock which prefer readers and changes MDL code to use this
new type of synchronization primitive.
No test case is added as existing rqg_mdl_stability test can
serve as one.