MDEV-8684: Make UT_RELAX_CPU expand to a non-empty statement on Power

Using __ppc_get_timebase translates to an mfspr instruction.
The mfspr instruction blocks FXU1 until it completes, but the other
pipelines remain available to execute instructions from other
SMT threads on the same core.

The latency of reading the timebase SPR is ~10 cycles.

So any impact on other threads is limited to other FXU1-only instructions
(basically other mfspr/mtspr ops).
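
As a rough illustration of how such a relax macro sits inside a spin
loop, here is a minimal standalone sketch. RELAX_CPU and spin_until_set
are hypothetical names invented for this example and are not part of
the InnoDB/XtraDB sources; the sketch also assumes glibc on Power
(for <sys/platform/ppc.h>) and falls back to a no-op elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__powerpc__) && defined(__GLIBC__)
# include <sys/platform/ppc.h>   /* glibc: uint64_t __ppc_get_timebase(void) */
/* Reading the timebase compiles to mfspr; the volatile store keeps the
   compiler from discarding the otherwise unused read. */
# define RELAX_CPU() do { \
    volatile uint64_t tb_sink = __ppc_get_timebase(); \
    (void) tb_sink; \
  } while (0)
#else
/* Assumption for this sketch only: no-op on other platforms. */
# define RELAX_CPU() ((void) 0)
#endif

/* Hypothetical helper: poll *flag up to max_spins times, relaxing the
   CPU between polls. Returns 1 if the flag was observed non-zero,
   0 otherwise. */
static int spin_until_set(const volatile int *flag, unsigned max_spins)
{
    assert(flag != NULL);
    for (unsigned i = 0; i < max_spins; i++) {
        if (*flag) {
            return 1;
        }
        RELAX_CPU();
    }
    return *flag != 0;
}
```

A caller would typically fall back to a blocking wait when
spin_until_set returns 0 rather than spin indefinitely.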

Suggested by Steven J. Munroe, Linux on Power Toolchain Architect,
Linux Technology Center
IBM Corporation
Daniel Black 2016-03-30 15:09:52 +11:00
parent 3d1a7cba71
commit 64824a760d
2 changed files with 10 additions and 0 deletions

@ -88,6 +88,11 @@ private:
the YieldProcessor macro defined in WinNT.h. It is a CPU architecture-
independent way by using YieldProcessor. */
# define UT_RELAX_CPU() YieldProcessor()
# elif defined(__powerpc__)
#include <sys/platform/ppc.h>
# define UT_RELAX_CPU() do { \
volatile lint volatile_var = __ppc_get_timebase(); \
} while (0)
# else
# define UT_RELAX_CPU() ((void)0) /* avoid warning for an empty statement */
# endif

@ -85,6 +85,11 @@ private:
the YieldProcessor macro defined in WinNT.h. It is a CPU architecture-
independent way by using YieldProcessor. */
# define UT_RELAX_CPU() YieldProcessor()
# elif defined(__powerpc__)
#include <sys/platform/ppc.h>
# define UT_RELAX_CPU() do { \
volatile lint volatile_var = __ppc_get_timebase(); \
} while (0)
# else
# define UT_RELAX_CPU() ((void)0) /* avoid warning for an empty statement */
# endif