mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 03:52:35 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	4b291588bb	MDEV-19845: Make my_cpu.h self-contained Fix up commit `f5c080c735`	2020-02-01 14:56:05 +02:00
Marko Mäkelä	7a3d34d645	Merge 10.3 into 10.4	2019-07-02 21:44:58 +03:00
Marko Mäkelä	709f0510e3	MDEV-19845: Adjust for Skylake based on benchmarks Even though the PAUSE instruction latency was increased from about 10 to 140 clock cycles in the Intel Skylake microarchitecture, it seems to be optimal to reduce the amount of subsequently executed PAUSE instructions not to 1/14, but to 1/2.	2019-07-02 17:44:05 +03:00
Marko Mäkelä	5e929ee8a0	MDEV-19845: Define my_timer_cycles() inline On clang, use __builtin_readcyclecounter() when available. Hinted by Sergey Vojtovich. (This may lead to runtime failure on ARM systems. The hardware should be available on ARMv8 (AArch64), but access to it may require special privileges.) We remove support for the proprietary Sun Microsystems compiler, and rely on clang or the __GNUC__ assembler syntax instead. For now, we retain support for IA-64 (Itanium) and 32-bit SPARC, even though those platforms are likely no longer widely used. We remove support for clock_gettime(CLOCK_SGI_CYCLE), because Silicon Graphics ceased supporting IRIX in December 2013. This was the only cycle timer interface available for MIPS. On PowerPC, we rely on the GCC 4.8 __builtin_ppc_get_timebase() (or clang __builtin_readcyclecounter()), which should be equivalent to the old assembler code on both 64-bit and 32-bit targets.	2019-06-28 19:19:31 +03:00
Marko Mäkelä	b7b0bc8f11	Merge 10.3 into 10.4 We omit the work-around commit `0b7fa5a05d` because it appears to be needed for CentOS 6 only, which we no longer support.	2019-06-27 17:54:47 +03:00
Marko Mäkelä	f5c080c735	MDEV-19845: Fix the build on some platforms On some platforms, MY_RELAX_CPU() falls back to an atomic memory operation, but my_cpu.h fails to include my_atomic.h.	2019-06-27 15:04:00 +03:00
Marko Mäkelä	0b7fa5a05d	MDEV-19845: Fix the build on some x86 targets The RDTSC instruction, which was introduced in the Intel Pentium, has been used in MariaDB for a long time. However, the __rdtsc() wrapper is not available by default in some x86 build environments. The simplest solution seems to be to replace the inlined instruction with a call to the wrapper function my_timer_cycles(). The overhead for the call should not affect the measurement threshold. On Windows and on AMD64, we will keep using __rdtsc() directly.	2019-06-27 12:19:51 +03:00
Marko Mäkelä	042fc29597	MDEV-19845: Adaptive spin loops Starting with the Intel Skylake microarchitecture, the PAUSE instruction latency is about 140 clock cycles instead of the earlier 10. On AMD processors, the latency could be 10 or 50 clock cycles, depending on microarchitecture. Because of this big range of latency, let us scale the loops around the PAUSE instruction based on timing results at server startup. my_cpu_relax_multiplier: New variable: How many times to invoke PAUSE in a loop. Only defined for IA-32 and AMD64. my_cpu_init(): Determine with RDTSC the time to run 16 PAUSE instructions in each of two unrolled loops, and based on the quicker of the two runs, initialize my_cpu_relax_multiplier. This form of calibration was suggested by Mikhail Sinyavin from Intel. LF_BACKOFF(), ut_delay(): Use my_cpu_relax_multiplier when available. ut_delay(): Define inline in my_cpu.h. UT_COMPILER_BARRIER(): Remove. This does not seem to have any effect, because in our ut_delay() implementation, no computations are being performed inside the loop. The purpose of UT_COMPILER_BARRIER() was to prohibit the compiler from reordering computations. It was not emitting any code.	2019-06-27 10:53:18 +03:00
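
The calibration described in `042fc29597` can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names `read_cycles()`, `cpu_relax()`, `choose_multiplier()`, `calibrate_relax()`, `spin_delay()` and `relax_multiplier` are hypothetical, and the threshold of 30 cycles per PAUSE and the multiplier values 200 and 100 are assumptions, chosen only so that a slow PAUSE halves the number of iterations, in line with the 1/2 reduction mentioned in `709f0510e3`. On non-x86 targets the sketch falls back to a monotonic clock and a compiler barrier just so that it still compiles.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#include <x86intrin.h>
/* Read the time stamp counter (RDTSC). */
static inline uint64_t read_cycles(void) { return __rdtsc(); }
/* Issue one PAUSE instruction. */
static inline void cpu_relax(void) { _mm_pause(); }
#else
#include <time.h>
/* Fallback for other targets: nanoseconds from a monotonic clock. */
static inline uint64_t read_cycles(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (uint64_t) ts.tv_sec * 1000000000u + (uint64_t) ts.tv_nsec;
}
/* Fallback: a compiler barrier only, no hardware hint. */
static inline void cpu_relax(void) { __asm__ __volatile__("" ::: "memory"); }
#endif

/* Sixteen relax instructions, unrolled so that no loop overhead is timed. */
#define PAUSE4  cpu_relax(); cpu_relax(); cpu_relax(); cpu_relax()
#define PAUSE16 PAUSE4; PAUSE4; PAUSE4; PAUSE4

/* Assumed values, for illustration only. */
enum { SLOW_PAUSE_CYCLES = 30, MULT_FAST = 200, MULT_SLOW = 100 };

/* Scaling factor for spin loops, in percent of the requested delay. */
static unsigned relax_multiplier = MULT_FAST;

/* Pick the multiplier from the quicker of two timed runs of 16 pauses. */
static unsigned choose_multiplier(uint64_t run1, uint64_t run2)
{
  uint64_t quicker = run1 < run2 ? run1 : run2;
  return quicker > SLOW_PAUSE_CYCLES * 16 ? MULT_SLOW : MULT_FAST;
}

/* Time two unrolled runs once at startup and store the result. */
static void calibrate_relax(void)
{
  uint64_t t0 = read_cycles();
  PAUSE16;
  uint64_t t1 = read_cycles();
  PAUSE16;
  uint64_t t2 = read_cycles();
  relax_multiplier = choose_multiplier(t1 - t0, t2 - t1);
}

/* Spin for `delay` units, scaled by the calibrated multiplier. */
static void spin_delay(unsigned delay)
{
  for (unsigned i = delay * relax_multiplier / 100; i--; )
    cpu_relax();
}
```

Taking the quicker of the two runs means that a single interrupt or context switch during calibration is less likely to make a fast PAUSE look slow. A caller would invoke `calibrate_relax()` once at startup and then use `spin_delay()` inside its spin loops.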

8 commits