The presumed reason for the error is that the file was opened
by a 3rd party antivirus or backup program, causing
ERROR_SHARING_VIOLATION on rename.
The fix, actually a workaround, is to retry MoveFileEx a couple of times
before finally giving up. We expect 3rd party programs not to hold the
file open for an extended time.
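The retry workaround described above can be sketched as a bounded retry loop. This is only an illustration, not the actual mysys code: retry_transient, attempt_fn, pause_fn and fake_rename are hypothetical names. On Windows the attempt callback would call MoveFileEx and mark the failure as retryable when GetLastError() returns ERROR_SHARING_VIOLATION.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: tries the operation once and returns true on
   success. On failure it sets *retryable to say whether another attempt
   may help (e.g. ERROR_SHARING_VIOLATION on Windows). */
typedef bool (*attempt_fn)(void *ctx, bool *retryable);

/* Hypothetical callback that waits between attempts (e.g. Sleep()). */
typedef void (*pause_fn)(unsigned attempt);

/* Retry a transient failure a bounded number of times, then give up. */
static bool retry_transient(attempt_fn attempt, void *ctx,
                            unsigned max_attempts, pause_fn pause)
{
  for (unsigned i = 0; i < max_attempts; i++)
  {
    bool retryable = false;
    if (attempt(ctx, &retryable))
      return true;
    if (!retryable)
      return false;              /* permanent error: do not retry */
    if (pause != NULL && i + 1 < max_attempts)
      pause(i);
  }
  return false;                  /* still failing after max_attempts */
}

/* Stand-in for a rename that keeps failing with a sharing violation until
   the other program closes the file; ctx counts the remaining failures. */
static bool fake_rename(void *ctx, bool *retryable)
{
  int *failures_left = ctx;
  if (*failures_left > 0)
  {
    (*failures_left)--;
    *retryable = true;
    return false;
  }
  return true;
}
```

Keeping the number of attempts small matches the expectation that the other program releases the file quickly; a permanent error is reported immediately.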
Add CRC32C code to mysys. The x86-64 implementation uses PCLMULQDQ in
addition to the CRC32 instruction, following the Intel whitepaper, and is
ported from rocksdb code.
Optimized ARM and POWER CRC32 were already present in mysys.
Removed some inline assembly, replaced by code from
https://github.com/intel/soft-crc
Also, replace GCC inline assembly for cpuid in ut0crc32 with __cpuid,
to fix "PIC register clobbered by 'ebx' in 'asm'".
This enables fast CRC32C on 32bit Intel processors with GCC.
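For reference, the checksum that the accelerated code paths compute can be written as a plain bit-at-a-time loop. This is a minimal portable sketch, not the mysys implementation; the function name crc32c_portable is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
   Pass 0 as crc for the first chunk and the previous result to continue. */
static uint32_t crc32c_portable(uint32_t crc, const void *buf, size_t len)
{
  const unsigned char *p = buf;
  crc = ~crc;
  while (len--)
  {
    crc ^= *p++;
    for (int bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return ~crc;
}
```

The standard check value for CRC-32C is 0xE3069283 for the 9-byte input "123456789", which is a convenient test for any accelerated variant.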
Passing a null pointer to a nonnull argument is not only undefined
behaviour, but it also grants the compiler the permission to optimize
away further checks whether the pointer is null. GCC -O2 at least
starting with version 8 may do that, potentially causing SIGSEGV.
These problems were caught in a WITH_UBSAN=ON build with the
Bug#7024 test in main.view.
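A typical instance of this problem is calling memcpy() with a null source and a zero length. The sketch below shows one way to guard such a call; append_bytes is a hypothetical helper and not code from the affected files.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append len bytes of src to dst at offset *used.
   Callers may pass src == NULL together with len == 0. */
static void append_bytes(char *dst, size_t *used, const char *src, size_t len)
{
  /* memcpy() declares its pointer arguments nonnull, so calling it with
     src == NULL is undefined even when len is 0. Skip the call instead. */
  if (len != 0)
  {
    memcpy(dst + *used, src, len);
    *used += len;
  }
}
```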
When MDEV-22669 introduced CRC-32C acceleration to IA-32,
it worked around a compiler bug by disabling the acceleration
on GCC 4 for IA-32 altogether, even though the compiler bug
only affects -fPIC builds that are targeting IA-32.
Let us extend the solution fe5dbfe723
and define HAVE_CPUID_INSTRUCTION that allows us to implement
a necessary and sufficient work-around of the compiler bug.
GCC before version 5 would fail to emit the CPUID instruction
when targeting IA-32 in -fPIC mode. Therefore, we must add the
CPUID instruction to the HAVE_CLMUL_INSTRUCTION check.
This means that the PCLMUL accelerated crc32() function will
not be available on i686 executables that are compiled with
GCC 4. The limitation does not impact AMD64 builds or non-PIC
x86 builds, or other compilers (clang, or GCC 5 or newer).
MDEV-22641 in commit dec3f8ca69
refactored a SIMD implementation of CRC-32 for the ISO 3309 polynomial
that uses the IA-32/AMD64 carry-less multiplication (pclmul)
instructions. The code was previously only available in Mariabackup;
it was changed to be a general replacement of the zlib crc32().
There exist AMD64 systems where CMAKE_SYSTEM_PROCESSOR matches
the pattern i[36]86 but not x86_64 or amd64. This would cause a
link failure, because mysys/checksum.c would basically assume that
compiler support for the instruction is always available on GCC-compatible
compilers on AMD64.
Furthermore, we were unnecessarily disabling the SIMD acceleration
for 32-bit executables.
Note: Until MDEV-22749 has been implemented, the PCLMUL instruction
will not be used on Microsoft Windows.
Closes: #1660
Raspberry Pi 4 supports crc32 but doesn't support pmull (MDEV-23030).
PR #1645 offers a fix for this issue, but it does not consider the case
where the target platform supports crc32 but does not support PMULL.
In that case, the code should still use the Arm64 crc32 instruction (__crc32c)
and only skip the parallel computation (pmull/vmull), rather than skipping
the hardware crc32 instructions altogether.
This PR also removes the unnecessary CRC32_ZERO branch in 'crc32c_aarch64' for
MariaDB and cleans up indentation and coding style.
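The three-way choice described above can be sketched as follows, assuming Linux on AArch64, where getauxval(AT_HWCAP) reports the HWCAP_CRC32 and HWCAP_PMULL bits. The enum and function names are hypothetical; the real code selects function pointers rather than an enum.

```c
#if defined(__linux__) && defined(__aarch64__)
# include <sys/auxv.h>
# include <asm/hwcap.h>
#endif

enum crc32c_impl { CRC32C_SOFTWARE, CRC32C_HW_CRC32, CRC32C_HW_CRC32_PMULL };

/* Pure selection logic, kept separate so it can be exercised anywhere. */
static enum crc32c_impl choose_crc32c(int has_crc32, int has_pmull)
{
  if (!has_crc32)
    return CRC32C_SOFTWARE;
  /* Without pmull, keep the crc32 instruction and only skip the
     parallel (pmull/vmull) part. */
  return has_pmull ? CRC32C_HW_CRC32_PMULL : CRC32C_HW_CRC32;
}

static enum crc32c_impl detect_crc32c(void)
{
#if defined(__linux__) && defined(__aarch64__)
  unsigned long caps = getauxval(AT_HWCAP);
  return choose_crc32c((caps & HWCAP_CRC32) != 0, (caps & HWCAP_PMULL) != 0);
#else
  return CRC32C_SOFTWARE;      /* other platforms are out of scope here */
#endif
}
```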
Change-Id: I76371a6bd767b4985600e8cca10983d71b7e9459
Signed-off-by: Yuqi Gu <yuqi.gu@arm.com>
Depending on the build configuration, the error might be hidden:
in particular, liblz4.so and libjemalloc.so make it disappear,
but with -DWITH_INNODB_LZ4=NO -DWITH_JEMALLOC=NO it reappears.
MariaDB adopted a hardware-optimized crc32c approach on ARM64 starting with 10.5.
This implementation of crc32c needs the target hardware to support both the
crc32 and pmull instructions. The existing logic checks only for crc32 support
through a runtime check, so if the target hardware does not support pmull,
things would fail or crash.
Expanded the runtime check so that pmull support is also checked on the target
hardware, along with the existing crc32 check.
Thanks to Marko and Daniel for review.
I run perf top during ./mtr testing and constantly see the times()
function there. It is so slow that it makes no sense to call it
in a loop too many times.
This patch speeds up -suite=innodb for me from 218s to 208s.
9s of times() function!
Small follow-up fix to MDEV-23175 to ensure a faster option on FreeBSD
and compatibility with Solaris, which isn't high resolution.
ftime is left as a backup in case an implementation doesn't
provide any of these clocks.
FreeBSD
$ ./unittest/mysys/my_rdtsc-t
1..11
# ----- Routine ---------------
# myt.cycles.routine : 5
# myt.nanoseconds.routine : 11
# myt.microseconds.routine : 13
# myt.milliseconds.routine : 11
# myt.ticks.routine : 17
# ----- Frequency -------------
# myt.cycles.frequency : 3610295566
# myt.nanoseconds.frequency : 1000000000
# myt.microseconds.frequency : 1000000
# myt.milliseconds.frequency : 899
# myt.ticks.frequency : 136
# ----- Resolution ------------
# myt.cycles.resolution : 1
# myt.nanoseconds.resolution : 1
# myt.microseconds.resolution : 1
# myt.milliseconds.resolution : 7
# myt.ticks.resolution : 1
# ----- Overhead --------------
# myt.cycles.overhead : 26
# myt.nanoseconds.overhead : 19140
# myt.microseconds.overhead : 19036
# myt.milliseconds.overhead : 578
# myt.ticks.overhead : 21544
ok 1 - my_timer_init() did not crash
ok 2 - The cycle timer is strictly increasing
ok 3 - The cycle timer is implemented
ok 4 - The nanosecond timer is increasing
ok 5 - The nanosecond timer is implemented
ok 6 - The microsecond timer is increasing
ok 7 - The microsecond timer is implemented
ok 8 - The millisecond timer is increasing
ok 9 - The millisecond timer is implemented
ok 10 - The tick timer is increasing
ok 11 - The tick timer is implemented
Largely based on MySQL commit
75271e51d6
MySQL Ref:
BUG#24566529: BACKPORT BUG#23575445 TO 5.6
(cut)
Also, the PTR_SANE macro which tries to check if a pointer
is invalid (used when printing pointer values in stack traces)
gave false negatives on OSX/FreeBSD. On these platforms we
now simply check if the pointer is non-null. This also removes
an sbrk() deprecation warning when building on OS X. (Previously it was
only disabled when building with Xcode.)
Removed execinfo path of MySQL patch that was already included.
sbrk doesn't exist on FreeBSD aarch64.
Removed HAVE_BSS_START based detection and replaced it with __linux__,
as __bss_start doesn't exist on OSX, Solaris or Windows. __bss_start
exists on multiple Linux architectures.
Tested on FreeBSD and Linux x86_64. Having been in FreeBSD ports for 2
years implies good testing on all FreeBSD architectures
too. The MySQL-8.0.21 code is functionally identical to the original commit.
aarch64 timer is available to userspace via arch register.
clang's __builtin_readcyclecounter is wrong for aarch64 (reads the PMU
cycle counter instead of the archi-timer register), so we don't use it.
my_rdtsc unit-test on AWS m6g shows:
frequency: 121830845
resolution: 1
overhead: 1
This counter is not strictly increasing, but it is non-decreasing.
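Reading the architected counter from user space can be sketched as below. This is an illustration only: it assumes GCC-style inline assembly and the CNTVCT_EL0 virtual counter register, and read_virtual_counter is a hypothetical name.

```c
#include <stdint.h>

/* Read the AArch64 virtual counter (CNTVCT_EL0).
   Returns 0 on other architectures, where this sketch has no counter. */
static inline uint64_t read_virtual_counter(void)
{
#if defined(__aarch64__) && defined(__GNUC__)
  uint64_t ticks;
  __asm__ volatile("mrs %0, cntvct_el0" : "=r"(ticks));
  return ticks;
#else
  return 0;
#endif
}
```

Two consecutive reads may return the same value, which is why callers should only rely on the counter being non-decreasing.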
This patch ensures that all identical character sets share the same
cs->csname.
This allows us to replace strcmp() in my_charset_same() with comparisons
of pointers. This fixes a long-standing performance issue that could cause
a strcmp() for every item sent through the protocol class to the end user.
One consequence of this patch is that we don't allow one to add a character
set definition in the Index.xml file that changes the csname of an existing
character set. This is by design, as changing the names of existing character
sets is extremely dangerous, especially as some storage engines just record
character set numbers.
As we now have a hash over character set's csname, we can in the future
use that for faster access to a specific character set. This could be done
by changing the hash to non-unique and using the hash to find the next
character set with the same csname.
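The idea of sharing one csname pointer per name can be sketched as below. This is a simplified illustration, not the MariaDB code: struct charset, intern_csname and charset_same are hypothetical, and a linear scan stands in for the hash over csname.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 64

/* Hypothetical, simplified stand-in for a character set descriptor. */
struct charset { unsigned number; const char *csname; };

static const char *interned[MAX_NAMES];
static size_t n_interned;

/* Return the canonical pointer for name, registering it on first use.
   The pointer is stored as-is, so the string must outlive the table.
   Returns NULL when the table is full. */
static const char *intern_csname(const char *name)
{
  for (size_t i = 0; i < n_interned; i++)
    if (strcmp(interned[i], name) == 0)
      return interned[i];
  if (n_interned == MAX_NAMES)
    return NULL;
  return interned[n_interned++] = name;
}

/* With shared names, "same character set" is a pointer comparison. */
static int charset_same(const struct charset *a, const struct charset *b)
{
  return a->csname == b->csname;
}
```

The strcmp() cost is paid once when a character set is registered, instead of on every comparison.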
Linux glibc has deprecated ftime, resulting in a compile error on Fedora-32.
Per the manual, clock_gettime is the suggested replacement. Because
my_timer_milliseconds is a relative time used largely by the performance
schema, CLOCK_MONOTONIC_COARSE is used. This has been available since
Linux-2.6.32.
The low overhead is shown in the unittest:
$ unittest/mysys/my_rdtsc-t
1..11
# ----- Routine ---------------
# myt.cycles.routine : 5
# myt.nanoseconds.routine : 11
# myt.microseconds.routine : 13
# myt.milliseconds.routine : 18
# myt.ticks.routine : 17
# ----- Frequency -------------
# myt.cycles.frequency : 3596597014
# myt.nanoseconds.frequency : 1000000000
# myt.microseconds.frequency : 1000000
# myt.milliseconds.frequency : 1039
# myt.ticks.frequency : 103
# ----- Resolution ------------
# myt.cycles.resolution : 1
# myt.nanoseconds.resolution : 1
# myt.microseconds.resolution : 1
# myt.milliseconds.resolution : 1
# myt.ticks.resolution : 1
# ----- Overhead --------------
# myt.cycles.overhead : 118
# myt.nanoseconds.overhead : 234
# myt.microseconds.overhead : 222
# myt.milliseconds.overhead : 30
# myt.ticks.overhead : 4946
ok 1 - my_timer_init() did not crash
ok 2 - The cycle timer is strictly increasing
ok 3 - The cycle timer is implemented
ok 4 - The nanosecond timer is increasing
ok 5 - The nanosecond timer is implemented
ok 6 - The microsecond timer is increasing
ok 7 - The microsecond timer is implemented
ok 8 - The millisecond timer is increasing
ok 9 - The millisecond timer is implemented
ok 10 - The tick timer is increasing
ok 11 - The tick timer is implemented
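A millisecond timer based on clock_gettime, as described in this commit, can be sketched as below. It is not the actual my_timer_milliseconds implementation: timer_milliseconds is a hypothetical name, and the fallback to CLOCK_MONOTONIC where the coarse clock is not defined is an assumption of this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Relative millisecond timer. Prefers the cheap coarse clock on Linux and
   falls back to CLOCK_MONOTONIC where it is not defined.
   Returns 0 if the clock cannot be read. */
static uint64_t timer_milliseconds(void)
{
#ifdef CLOCK_MONOTONIC_COARSE
  const clockid_t clk = CLOCK_MONOTONIC_COARSE;
#else
  const clockid_t clk = CLOCK_MONOTONIC;
#endif
  struct timespec ts;
  if (clock_gettime(clk, &ts) != 0)
    return 0;
  return (uint64_t) ts.tv_sec * 1000 + (uint64_t) ts.tv_nsec / 1000000;
}
```

Because the value is only used as a relative time, a monotonic clock is sufficient and avoids jumps when the wall clock is adjusted.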
The merge commit 0fd89a1a89
of commit b6ec1e8bbf
seems to cause occasional MemorySanitizer failures,
because it failed to replace some MEM_UNDEFINED() calls
with MEM_MAKE_ADDRESSABLE().
my_large_free(): Correctly invoke MEM_MAKE_ADDRESSABLE() after
freeing memory. Failure to do so could cause bogus
AddressSanitizer failures for memory allocated by my_large_malloc().
On MemorySanitizer, we will do nothing.
buf_pool_t::chunk_t::create(): Replace the MEM_MAKE_ADDRESSABLE()
that had been added in commit 484931325e
to work around the issue.
In AddressSanitizer, we only want memory poisoning to happen
in connection with custom memory allocation or freeing.
The primary use of MEM_UNDEFINED is for declaring memory uninitialized
in Valgrind or MemorySanitizer. We do not want MEM_UNDEFINED to
have the unwanted side effect that AddressSanitizer would no longer
be able to complain about accessing unallocated memory.
MEM_UNDEFINED(): Define as no-op for AddressSanitizer.
MEM_MAKE_ADDRESSABLE(): Define as MEM_UNDEFINED() or
ASAN_UNPOISON_MEMORY_REGION().
MEM_CHECK_ADDRESSABLE(): Wrap also __asan_region_is_poisoned().
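The macro split described above could look roughly like this. It is a reduced sketch, not the real header: only the AddressSanitizer and plain builds are covered, the Valgrind and MemorySanitizer definitions are left out, and SKETCH_HAVE_ASAN is a hypothetical helper macro.

```c
#include <assert.h>
#include <stddef.h>

/* Detect AddressSanitizer with GCC (__SANITIZE_ADDRESS__) or clang. */
#if defined(__SANITIZE_ADDRESS__)
# define SKETCH_HAVE_ASAN 1
#elif defined(__has_feature)
# if __has_feature(address_sanitizer)
#  define SKETCH_HAVE_ASAN 1
# endif
#endif

#ifdef SKETCH_HAVE_ASAN
# include <sanitizer/asan_interface.h>
/* Under ASAN, declaring memory uninitialized must not unpoison it. */
# define MEM_UNDEFINED(a, len)         ((void) 0)
# define MEM_MAKE_ADDRESSABLE(a, len)  ASAN_UNPOISON_MEMORY_REGION(a, len)
# define MEM_CHECK_ADDRESSABLE(a, len) \
    assert(!__asan_region_is_poisoned((void *) (a), (len)))
#else
/* Plain build: a Valgrind or MSAN build would define MEM_UNDEFINED to
   mark the region uninitialized; this sketch leaves it as a no-op. */
# define MEM_UNDEFINED(a, len)         ((void) 0)
# define MEM_MAKE_ADDRESSABLE(a, len)  MEM_UNDEFINED(a, len)
# define MEM_CHECK_ADDRESSABLE(a, len) ((void) 0)
#endif
```

A custom allocator would call MEM_MAKE_ADDRESSABLE() on memory it hands out or releases back to the system, and MEM_UNDEFINED() only to mark contents as uninitialized.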