Make TokuDB report row lock waits with thd_rpl_deadlock_check(). This allows
parallel replication to properly detect conflicts, and kill and retry the
offending transaction.
Merge into the MariaDB tree the pull request from Rich Prohaska for
PerconaFT. These changes are needed to get parallel replication to
work with TokuDB. Once the pull request is accepted by Percona and the new upstream version enters MariaDB, this commit can be superseded.
Original commit message from Rich Prohaska:
1. Fix the release before wait race
The release before wait race occurs when a lock is released by transaction A after transaction B tried to acquire it but before transaction B has a chance to register its pending lock request. There are several ways to fix this problem, but we want to optimize for the common situation of minimal lock conflicts, which is what the lock acquisition algorithm currently does. Our solution to the release before wait race is for transaction B to retry its lock request after its lock request has been added to the pending lock set.
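The race and the fix can be illustrated with a small single-threaded model. This is only a sketch and not the PerconaFT code: ToyLockTable, its members, and the `between` callback (which stands in for whatever another thread does inside the race window) are all hypothetical names invented for this example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>

// Hypothetical toy lock table: one exclusive owner per key, plus the set of
// transactions waiting on each key. Not the PerconaFT data structures.
struct ToyLockTable {
    std::map<int, int> owner;              // key -> owning txn id
    std::map<int, std::set<int>> pending;  // key -> waiting txn ids

    bool try_lock(int key, int txn) {
        auto it = owner.find(key);
        if (it != owner.end() && it->second != txn) return false;
        owner[key] = txn;
        return true;
    }

    // Release hands the key to a waiter that is registered *right now*.
    // A waiter that registers later gets nothing from this release.
    void release(int key, int txn) {
        auto it = owner.find(key);
        assert(it != owner.end() && it->second == txn);  // only the owner releases
        owner.erase(it);
        auto p = pending.find(key);
        if (p != pending.end() && !p->second.empty()) {
            int next = *p->second.begin();
            p->second.erase(p->second.begin());
            owner[key] = next;
        }
    }

    // Returns true if the lock was granted without having to wait.
    bool acquire(int key, int txn, const std::function<void()> &between,
                 bool retry_after_register) {
        if (try_lock(key, txn)) return true;
        between();                     // race window: the owner may release here
        pending[key].insert(txn);      // register the pending lock request
        if (retry_after_register && try_lock(key, txn)) {
            pending[key].erase(txn);   // granted after all, so stop waiting
            return true;
        }
        return false;                  // caller would now block until timeout
    }
};
```

With retry_after_register set to false, a release that lands in the window leaves the key free while the requester sits in the pending set until it times out. With it set to true, the second attempt picks up the free key, and the common no-conflict path still costs only one attempt.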
2. Fix the retry race
The retry race occurs in the current lock retry algorithm, which assumes that if some transaction is running lock retry, then my transaction does not also need to run it. There is a chance that some pending lock requests will be skipped, but these lock requests will eventually time out. For applications with small numbers of concurrent transactions, timeouts will frequently occur, and the application throughput will be very low.
The solution to the retry race is to use a group retry algorithm. All threads run through the retry logic. Sequence numbers are used to group retries into batches such that one transaction can run the retry logic on behalf of several transactions. This amortizes the retry cost. The sequence numbers also ensure that when a transaction releases its locks, all of the pending lock requests that it is blocking are retried.
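One way to picture the batching is with a ticket counter and a "covered up to" counter. The sketch below is a simplified illustration under stated assumptions, not the PerconaFT implementation: GroupRetry, take_ticket and run_or_skip are hypothetical names, and the retry pass runs while holding the mutex purely to keep the example short.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>

// Hypothetical batching helper; all names are illustrative only.
struct GroupRetry {
    std::atomic<unsigned long long> want{0};  // sequence numbers issued so far
    unsigned long long done = 0;              // highest sequence number covered by a pass
    std::mutex mu;                            // guards done and the retry pass

    // Every transaction takes a sequence number after releasing its locks.
    unsigned long long take_ticket() { return ++want; }

    // Runs retry_pending unless a pass that started after my_seq was issued
    // has already covered it. Returns true if this call ran the pass.
    bool run_or_skip(unsigned long long my_seq,
                     const std::function<void()> &retry_pending) {
        assert(my_seq <= want.load());  // the ticket must have been issued
        std::lock_guard<std::mutex> guard(mu);
        if (done >= my_seq) return false;  // someone retried on my behalf
        done = want.load();                // this pass covers all tickets so far
        retry_pending();                   // re-evaluate all pending lock requests
        return true;
    }
};
```

A caller skips only when a pass began after its own ticket was taken, so no release goes without a retry pass that follows it, while several concurrent releases can share a single pass.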
3. Implement a mechanism to find and kill a pending lock request
Tag lock requests with a client id, use the client id as a key into the pending lock request sets to find a lock request, and complete the lock request with a lock timeout error.
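A minimal sketch of that lookup follows. It assumes at most one pending request per client id, and PendingSet, PendingRequest, WaitResult and kill_waiter are hypothetical names for this example rather than the PerconaFT API. A real implementation would also wake the waiting thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

enum class WaitResult { Waiting, Granted, LockTimeout };

// Hypothetical pending lock request, owned by the waiting transaction.
struct PendingRequest {
    int key;
    WaitResult result;
};

// Hypothetical pending set keyed by client id.
struct PendingSet {
    std::map<std::uint64_t, PendingRequest *> by_client;

    void add(std::uint64_t client_id, PendingRequest *r) {
        assert(r != nullptr);
        by_client[client_id] = r;
    }

    // Finds the request tagged with client_id, completes it with a lock
    // timeout error, and removes it. Returns false if nothing was pending.
    bool kill_waiter(std::uint64_t client_id) {
        auto it = by_client.find(client_id);
        if (it == by_client.end()) return false;
        it->second->result = WaitResult::LockTimeout;
        by_client.erase(it);
        return true;
    }
};
```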
Copyright (c) 2016, Rich Prohaska
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The following directives to ignore warnings were in the PerconaFT build in TokuDB.
They generate errors when g++ ... -o xxx.so is used to compile a shared object.
As the code does not actually trigger these warnings, the directives have been removed.
* -Wno-ignored-attributes
* -Wno-pointer-bool-conversion
Signed-off-by: Daniel Black <daniel.black@au.ibm.com>
mincore() is declared differently on BSD, mincore(void *, size_t, char *),
than on Linux, mincore(void *, size_t, unsigned char *).
Account for this difference in TokuDB.
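One possible way to absorb the difference is a typedef chosen at compile time. This is a sketch only: testing __linux__ is an assumption (the actual change may use a build-time check instead), and mincore_vec_t and resident_pages are hypothetical names.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: __linux__ selects the unsigned char * variant; everything else
// is treated as the BSD char * variant.
#if defined(__linux__)
typedef unsigned char mincore_vec_t;
#else
typedef char mincore_vec_t;
#endif

// Returns the number of resident pages in [addr, addr + len), or -1 on
// error. addr must be page aligned.
long resident_pages(void *addr, std::size_t len) {
    assert(addr != nullptr);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;
    std::size_t psz = static_cast<std::size_t>(page);
    std::size_t npages = (len + psz - 1) / psz;
    std::vector<mincore_vec_t> vec(npages);
    if (mincore(addr, len, vec.data()) != 0) return -1;
    long count = 0;
    for (mincore_vec_t v : vec) {
        if (v & 1) ++count;  // low bit set means the page is resident
    }
    return count;
}
```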
Linking tokudb with jemalloc privately causes problems on library
load/unload. To prevent dangling destructor pointers, link with the same
library as the server is using.