[PATCH] BUG#30379 Better randomise time before retry in timeout check (DBTC)

timeOutLoopStartLab() checks whether any transactions have been delayed
for so long that we are forced to take some action (e.g. abort,
resend, etc.).

It is *MEANT* to (according to the comment):
> To avoid aborting both transactions in a deadlock detected by time-out
> we insert a random extra time-out of upto 630 ms by using the lowest
> six bits of the api connect reference.
> We spread it out from 0 to 630 ms if base time-out is larger than 3 sec,
> we spread it out from 0 to 70 ms if base time-out is smaller than 300 msec,
> and otherwise we spread it out 310 ms.

The comment (as all do) lies.

The API connect reference is not very random: its low bits produce
incredibly predictable "random" numbers. Two deadlocked transactions can
therefore get nearly the same extra time-out, which could lead to both
txns being aborted instead of just one.
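As a rough illustration of why the old scheme is predictable, here is a
minimal C sketch. It assumes the timer counts in 10 ms ticks (inferred from
six bits giving 0 to 630 ms) and that the mask follows the thresholds the
comment describes. jitter_mask() and old_time_out() are hypothetical names
used only for this sketch; they are not functions in DbtcMain.cpp.

```c
#include <stdint.h>

/* Hypothetical helper: choose the jitter mask from the base time-out.
 * Assumption: values are in 10 ms ticks, so 300 ticks == 3 s. */
static uint32_t jitter_mask(uint32_t time_out_param)
{
    if (time_out_param > 300)
        return 63;              /* base > 3 s: spread 0..630 ms */
    if (time_out_param < 30)
        return 7;               /* base < 300 ms: spread 0..70 ms */
    return 31;                  /* otherwise: spread 0..310 ms */
}

/* Old behaviour: the "random" part is just the low bits of the
 * api connect reference, so neighbouring records get neighbouring values. */
static uint32_t old_time_out(uint32_t time_out_param, uint32_t api_con_ptr)
{
    return time_out_param + (api_con_ptr & jitter_mask(time_out_param));
}
```

With a base of 120 ticks, references 2, 3 and 34 give 122, 123 and 122,
which matches the shape of the "Before" output: the extra time-out only
varies as much as the references themselves do.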

Before:
timeout value: 123 3
timeout value: 122 2
timeout value: 122 2
timeout value: 122 2
timeout value: 123 3

After:
timeout value: 127 7
timeout value: 126 6
timeout value: 129 9
timeout value: 139 19
timeout value: 137 17
timeout value: 151 31
timeout value: 130 10
timeout value: 132 12
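
For comparison, a minimal sketch of the new computation, assuming the same
linear congruential generator that the patch adds in ndb_rand.c. The names
lcg_rand() and new_time_out() are hypothetical, and the state is a
fixed-width uint32_t here (the patch uses unsigned long) purely to keep the
sketch portable.

```c
#include <stdint.h>

static uint32_t lcg_state = 1;  /* same default seed as the patch */

/* POSIX-style example generator: returns a value in 0..32767. */
static int lcg_rand(void)
{
    lcg_state = lcg_state * 1103515245u + 12345u;
    return (int)((lcg_state / 65536u) % 32768u);
}

/* New behaviour: the jitter no longer depends on which api connect
 * record is being checked, only on the generator state. */
static uint32_t new_time_out(uint32_t time_out_param, uint32_t mask_value)
{
    return time_out_param + ((uint32_t)lcg_rand() & mask_value);
}
```

Each call yields a value between time_out_param and time_out_param +
mask_value, so with a base of 120 and a mask of 31 the result stays in
120..151, consistent with the "After" output.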

Index: ndb-work/ndb/src/kernel/blocks/dbtc/DbtcMain.cpp
===================================================================
stewart@flamingspork.com[stewart] 2007-09-25 12:01:23 +02:00
parent f2159e2d21
commit 33412d2b8e
4 changed files with 78 additions and 2 deletions

@@ -0,0 +1,33 @@
/* Copyright (C) 2003 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#ifndef NDB_RAND_H
#define NDB_RAND_H
#define NDB_RAND_MAX 32767
#ifdef __cplusplus
extern "C" {
#endif
int ndb_rand(void);
void ndb_srand(unsigned seed);
#ifdef __cplusplus
}
#endif
#endif

@@ -24,7 +24,8 @@ libgeneral_la_SOURCES = \
 uucode.c random.c version.c \
 strdup.c \
 ConfigValues.cpp ndb_init.c basestring_vsnprintf.c \
-Bitmask.cpp
+Bitmask.cpp \
+ndb_rand.c
 EXTRA_PROGRAMS = testBitmask
 testBitmask_SOURCES = testBitmask.cpp

@@ -0,0 +1,40 @@
/* Copyright (C) 2003 MySQL AB
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */
#include <ndb_rand.h>

static unsigned long next= 1;

/**
 * ndb_rand
 *
 * constant time, cheap, pseudo-random number generator.
 *
 * NDB_RAND_MAX assumed to be 32767
 *
 * This is the POSIX example for "generating the same sequence on
 * different machines". Although that is not one of our requirements.
 */
int ndb_rand(void)
{
  next= next * 1103515245 + 12345;
  return((unsigned)(next/65536) % 32768);
}

void ndb_srand(unsigned seed)
{
  next= seed;
}

@@ -20,6 +20,7 @@
 #include <RefConvert.hpp>
 #include <ndb_limits.h>
 #include <my_sys.h>
+#include <ndb_rand.h>
 #include <signaldata/EventReport.hpp>
 #include <signaldata/TcKeyReq.hpp>
@@ -6278,7 +6279,8 @@ void Dbtc::timeOutLoopStartLab(Signal* signal, Uint32 api_con_ptr)
 jam();
 if (api_timer != 0) {
 Uint32 error= ZTIME_OUT_ERROR;
-time_out_value= time_out_param + (api_con_ptr & mask_value);
+time_out_value= time_out_param + (ndb_rand() & mask_value);
+ndbout_c("timeout value: %u %u",time_out_value, time_out_value-time_out_param);
 if (unlikely(old_mask_value)) // abort during single user mode
 {
 apiConnectptr.i = api_con_ptr;