Commit graph

30 commits

Author SHA1 Message Date
Oleksandr Byelkin
1640c9b06e Merge branch '11.2' into 11.4 2024-08-04 17:27:48 +02:00
Marko Mäkelä
232d7a5e2d MDEV-34565: SIGILL due to OS not supporting AVX512
It is not sufficient to check that the CPU supports the necessary
instructions. Also the operating system (or virtual machine hypervisor)
must enable all the AVX registers to be saved and restored on a
context switch.

Because clang 8 does not support the compiler intrinsic _xgetbv()
we will require clang 9 or later for enabling the use of VPCLMULQDQ
and the related AVX512 features.
2024-07-29 10:58:09 +03:00
Alexander Barkov
5fb07d942b Merge remote-tracking branch 'origin/11.2' into 11.4 2024-07-09 21:45:37 +04:00
Anson Chung
215fab68db Perform simple fixes for cppcheck findings
Rectify cases of mismatched brackets and address
possible cases of division by zero by checking if
the denominator is zero before dividing.

No functional changes were made.

All new code of the whole pull request, including one or several
files that are either new files or modified ones, are contributed
under the BSD-new license. I am contributing on behalf of my
employer Amazon Web Services, Inc.
2024-07-08 10:51:48 +01:00
Marko Mäkelä
72ceae7314 MDEV-34510: UBSAN: overflow on adding an unsigned offset
crc32_avx512(): Explicitly cast ssize_t(size) to make it clear that
we are indeed applying a negative offset to a pointer.
2024-07-08 18:17:29 +10:00
Alexander Barkov
c4bf4ce948 Merge remote-tracking branch 'origin/11.2' into 11.4 2024-06-17 15:46:39 +04:00
Marko Mäkelä
d9d0e8fd60 MDEV-34321: call to crc32c_3way through pointer to incorrect function type
In commit 9ec7819c58 the CRC-32 function
signatures had been unified somewhat, but not enough.

clang -fsanitize=undefined would flag a function pointer signature
mismatch between const char* and const void*, but not between
uint32_t and unsigned. We try to fix both inconsistencies anyway.

Reviewed by: Vladislav Vaintroub
2024-06-07 13:51:46 +03:00
Oleksandr Byelkin
99b370e023 Merge branch '11.2' into 11.4 2024-05-21 19:38:51 +02:00
Marko Mäkelä
266495b93e MDEV-33817 fixup: Disable for macOS
According to https://discussions.apple.com/thread/8256853
an attempt to use AVX512 registers on macOS will result in #UD
(crash at runtime).

Also, starting with clang-18 and GCC 14, we must add "evex512" to the
target flags so that AVX and SSE instructions can use AVX512 specific
encodings. This flag was introduced together with the avx10.1-512 target.
Older compiler versions do not recognize "evex512". We do not want to
write "avx10.1-512" because it could enable some AVX512 subfeatures
that we do not have any CPUID check for.

Reviewed by: Vladislav Vaintroub
Tested on macOS by: Valerii Kravchuk
2024-05-21 16:45:07 +03:00
Marko Mäkelä
9ea1f67214 MDEV-33817: Proper intrinsics for vextracti32x4 2024-05-12 08:04:06 +03:00
Marko Mäkelä
9ec7819c58 MDEV-33817: AVX512BW and VPCLMULQDQ based CRC-32
This is based on https://github.com/intel/intel-ipsec-mb/
and has been tested both on x86 and x86-64, with code that
was generated by several versions of GCC and clang.
GCC 11 or clang 8 or later should be able to compile this,
and so should recent versions of MSVC.

Thanks to Intel Corporation for providing access to hardware,
for answering my questions regarding the code, and for
providing the coefficients for the CRC-32C computation.

crc32_avx512(): Compute a reverse polynomial CRC-32 using
precomputed tables and carry-less product, for up to 256 bytes
of unaligned input per loop iteration.

Reviewed by: Vladislav Vaintroub
2024-05-03 15:55:20 +03:00
Marko Mäkelä
611cd6b981 MDEV-33817 preparation: Restructuring and unit tests
In our unit test, let us rely on our own reference
implementation using the reflected
CRC-32 ISO 3309 and CRC-32C polynomials. Let us also
test with various lengths.

Let us refactor the CRC-32 and CRC-32C implementations
so that no special compilation flags will be needed and
that some function call indirection will be avoided.

pmull_supported: Remove. We will have pointers to two separate
functions crc32c_aarch64_pmull() and crc32c_aarch64().
2024-05-03 13:06:13 +03:00
Vladislav Vaintroub
905c3d61e1 MDEV-25870 followup - some Windows ARM64 improvements
- optimize atomic store64/load64 implementation.
- allow CRC32 optimization. Do not allow pmull yet, as this fails like in
  https://stackoverflow.com/questions/54048837/how-to-perform-polynomial-multiplication-using-arm64
2023-09-24 11:20:38 +02:00
Like Ma
97ed3dd6df Remove unused header from crc32c.cc 2022-02-24 19:41:00 +11:00
nia
316a8cebf5 Fix building crc32_arm64 on NetBSD/aarch64
pmull_supported is not necessarily defined before crc32c_aarch64

Signed-off-by: Nia Alarie <nia@NetBSD.org>
2021-07-22 16:41:59 +10:00
Brad Smith
4926498a67 CRC32 on OpenBSD/powerpc64.
closes #1828
2021-05-26 11:17:26 +10:00
Marko Mäkelä
58f184a4cb MDEV-24745 Generic CRC-32C computation wrongly uses SSE4.2 instructions
In commit d25f806d73 (MDEV-22749)
the CRC-32C implementation of MariaDB was broken on some
IA-32 and AMD64 builds, depending on the compiler version and
build options. This was verified for IA-32 on GCC 10.2.1.

Even though we try to identify the SSE4.2 extensions and the
availaibility of the PCLMULQDQ instruction by executing CPUID,
the fall-back code could be generated with extended instructions,
because the entire file mysys/crc32/crc32c.c was being compiled
with -msse4.2 -mpclmul. This would cause SIGILL on a PINSRD
instruction on affected IA-32 targets (such as some Intel Atom
processors). This might also affect old AMD64 processors
(predating the 2007 Intel Nehalem microarchitecture), if some
compiler chose to emit the offending instructions.

While it is fine to pass a target-specific option to a target-specific
compilation unit (like -mpclmul to a PCLMUL-specific compilation unit),
that is not safe for mixed-architecture compilation units.

For mixed-architecture compilation units, the correct way is to set
target attributes on the target-specific functions.

There does not seem to be a way to pass target attributes to
function template instantiation. Hence, we must replace the
ExtendImpl template with plain functions crc32_sse42() and
crc32_slow().

We will also remove some inconsistency between
my_crc32_implementation() and mysys_namespace::crc32::Choose_Extend().

The function crc32_pclmul_enabled() will be moved to mysys/crc32/crc32c.cc
so that the detection code will be compiled without -msse4.2 -mpclmul.

The AMD64 PCLMUL accelerated crc32c_3way() will be moved to a new
file crc32c_amd64.cc. In this way, only a few functions that depend
on -msse4.2 in mysys/crc32/crc32c.cc can be declared with
__attribute__((target("sse4.2"))), and most of the file can be compiled
for the generic target.

Last, the file mysys/crc32ieee.cc will be omitted on 64-bit POWER,
because it was dead code (no symbols were exported).

Reviewed by: Vladislav Vaintroub
2021-04-13 16:15:15 +03:00
Daniel Black
5dbea46cfd crc32c: Fix AIX compulation - ALIGN defined
ALIGN was defined already:

mysys/crc32/crc32c.cc:390: warning: "ALIGN" redefined
 #define ALIGN(n, m)     ((n + ((1 << m) - 1)) & ~((1 << m) - 1))

In file included from /root/aix/build/include/my_global.h:543,
                 from /root/aix/build/mysys/crc32/crc32c.cc:22:
/opt/freeware/lib/gcc/powerpc-ibm-aix7.1.0.0/8/include-fixed/sys/socket.h:788: note: this is the location of the previous definition
 #define ALIGN(p)                (ulong)((caddr_t)(p) + MACHINE_ALIGNMENT - 1 - \
2021-03-18 14:40:54 +11:00
Etienne Guesnet
60d1461a28 CRC32 on AIX 2021-03-18 14:40:54 +11:00
David CARLIER
b1241585b2
Mac M1 build support proposal (minus rocksdb option) (#1743) 2021-01-30 17:04:27 +02:00
David Carlier
d79141d641 msys: detects crc/cryptosupport on FreeBSD/arm
closes #1737

Reviewers: Marko Mäkelä, Krunal Bauskar
2021-01-15 18:49:43 +11:00
pkubaj
dc3681e5ed Implement CPU feature checks for FreeBSD/powerpc64
Fixes build on powerpc64 and powerpc64le.

Closes: #1710
2021-01-15 15:40:10 +11:00
Monty
1555c6d125 Fixed compiler warnings from crc32c.cc 2020-11-26 19:13:37 +02:00
Vladislav Vaintroub
ccbe6bb6fc MDEV-19935 Create unified CRC-32 interface
Add CRC32C code to mysys. The x86-64 implementation uses PCMULQDQ in addition to CRC32 instruction
after Intel whitepaper, and is ported from rocksdb code.

Optimized ARM and POWER CRC32 were already present in mysys.
2020-09-17 16:07:37 +02:00
Vladislav Vaintroub
30ff616403 MDEV-23680 Assertion `data' failed in crcr32_calc_pclmulqdq
Fix DBUG_ASSERT
2020-09-07 12:08:26 +02:00
Vladislav Vaintroub
d25f806d73 MDEV-22749 Implement portable PCLMUL accelerated crc32() with Intel intrinsics
Removed some inine assembly, replaced by code from
https://github.com/intel/soft-crc

Also,replace GCC inline assembly for cpuid in ut0crc32 with __cpuid,
to fix "PIC register clobbered by 'ebx' in 'asm'.
This enables fast CRC32C on 32bit Intel processors with GCC.
2020-09-04 23:07:49 +02:00
Marko Mäkelä
b47d61d04f MDEV-23585: Fix HAVE_CLMUL_INSTRUCTION
MDEV-22641 in commit dec3f8ca69
refactored a SIMD implementation of CRC-32 for the ISO 3309 polynomial
that uses the IA-32/AMD64 carry-less multiplication (pclmul)
instructions. The code was previously only available in Mariabackup;
it was changed to be a general replacement of the zlib crc32().

There exist AMD64 systems where CMAKE_SYSTEM_PROCESSOR matches
the pattern i[36]86 but not x86_64 or amd64. This would cause a
link failure, because mysys/checksum.c would basically assume that
the compiler support for instruction is always available on GCC-compatible
compilers on AMD64.

Furthermore, we were unnecessarily disabling the SIMD acceleration
for 32-bit executables.

Note: Until MDEV-22749 has been implemented, the PCLMUL instruction
will not be used on Microsoft Windows.

Closes: #1660
2020-08-27 09:34:53 +03:00
Yuqi Gu
151fc0ed88 MDEV-23495: Refine Arm64 PMULL runtime check in MariaDB
Raspberry Pi 4 supports crc32 but doesn't support pmull (MDEV-23030).

The PR #1645 offers a solution to fix this issue. But it does not consider
the condition that the target platform does support crc32 but not support PMULL.

In this condition, it should leverage the Arm64 crc32 instruction (__crc32c) and
just only skip parallel computation (pmull/vmull) rather than skip all hardware
crc32 instruction of computation.

The PR also removes unnecessary CRC32_ZERO branch in 'crc32c_aarch64' for MariaDB,
formats the indent and coding style.

Change-Id: I76371a6bd767b4985600e8cca10983d71b7e9459
Signed-off-by: Yuqi Gu <yuqi.gu@arm.com>
2020-08-21 20:41:35 +03:00
Krunal Bauskar
c69520c9df MDEV-23030: ARM crash on Raspberry Pi 4
MariaDB adopted a hardware optimized crc32c approach on ARM64 starting 10.5.
Said implementation of crc32c needs support from target hardware for crc32
and pmull instructions. Existing logic is checking only for crc32 support
from target hardware through a runtime check and so if target hardware
doesn't support pmull it would cause things to fail/crash.

Expanded runtime check to ensure pmull support is also checked on the target
hardware along with existing crc32.

Thanks to Marko and Daniel for review.
2020-07-30 15:44:54 +03:00
mysqlonarm
dec3f8ca69
MDEV-22641: Provide SIMD optimized wrapper for zlib crc32() (#1558)
Existing implementation used my_checksum (from mysys)
for calculating table checksum and binlog checksum.

This implementation was optimized for powerpc only and lacked
SIMD implementation for x86 (using clmul) and ARM
(using ACLE) instead used zlib-crc32.

mariabackup had its own copy of the crc32 implementation
using hardware optimized implementation only for x86 and lagged
hardware based implementation for powerpc and ARM.

Patch helps unifies all such calls and help aggregate all of them
using an unified interface my_checksum().

Said unification also enables hardware optimized calls for all
architecture viz. x86, ARM, POWERPC.
Default always fallback to zlib crc32.

Thanks to Daniel Black for reviewing, fixing and testing
PowerPC changes. Thanks to Marko and Daniel for early code feedback.
2020-06-01 11:34:06 +03:00