Commit graph

1628 commits

Author SHA1 Message Date
Marko Mäkelä
e55397a46d Merge 10.5 into 10.6 2022-12-05 18:04:23 +02:00
Jan Lindström
4eb8e51c26 Merge 10.4 into 10.5 2022-11-30 13:10:52 +02:00
anson1014
966d22b715
Ensure that source files contain only valid UTF8 encodings (#2188)
Modern software (including text editors, static analysis software,
and web-based code review interfaces) often requires source code files
to be interpretable via a consistent character encoding, with UTF-8 or
ASCII (a strict subset of UTF-8) as the default. Several of the MariaDB
source files contain bytes that are not valid in either the UTF-8 or
ASCII encodings, but instead represent strings encoded in the
ISO-8859-1/Latin-1 or ISO-8859-2/Latin-2 encodings.

These inconsistent encodings may prevent software from correctly
presenting or processing such files. Converting all source files to
valid UTF8 characters will ensure correct handling.

Comments written in Czech were replaced with lightly-corrected
translations from Google Translate. Additionally, comments describing
the proper handling of special characters were changed so that the
comments are now purely UTF8.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.

Co-authored-by: Andrew Hutchings <andrew@linuxjedi.co.uk>
2022-08-30 09:21:40 +01:00
Marko Mäkelä
30914389fe Merge 10.5 into 10.6 2022-07-27 17:52:37 +03:00
Marko Mäkelä
098c0f2634 Merge 10.4 into 10.5 2022-07-27 17:17:24 +03:00
Oleksandr Byelkin
3bb36e9495 Merge branch '10.3' into 10.4 2022-07-27 11:02:57 +02:00
Rucha Deodhar
dbe39f14fe MDEV-28762: recursive call of some json functions without stack control
Analysis: Some recursive json functions dont check for stack control
Fix: Add check_stack_overrun(). The last argument is NULL because it is not
used
2022-07-20 19:24:48 +05:30
Sergei Golubchik
3bc98a4ec4 Merge branch '10.5' into 10.6 2022-05-10 14:01:23 +02:00
Sergei Golubchik
ef781162ff Merge branch '10.4' into 10.5 2022-05-09 22:04:06 +02:00
Alexander Barkov
0e4bc67eab 10.4 specific changes for "MDEV-27494 Rename .ic files to .inl"
Renaming ctype-uca.ic to ctype-uca.inl

This file was introduced in 10.4,
so it did not get to the main 10.2 patch for MDEV-27494
2022-04-28 10:52:11 +04:00
Marko Mäkelä
b242c3141f Merge 10.5 into 10.6 2022-03-29 16:16:21 +03:00
Marko Mäkelä
d62b0368ca Merge 10.4 into 10.5 2022-03-29 12:59:18 +03:00
Marko Mäkelä
ae6e214fd8 Merge 10.3 into 10.4 2022-03-29 11:13:18 +03:00
Marko Mäkelä
020e7d89eb Merge 10.2 into 10.3 2022-03-29 09:53:15 +03:00
Alexey Botchkov
6277e7df6b MDEV-22742 UBSAN: Many overflow issues in strings/decimal.c - runtime error: signed integer overflow: x * y cannot be represented in type 'long long int' (on optimized builds).
Avoid integer overflow, do the check before the calculation.
2022-03-21 15:05:42 +04:00
Marko Mäkelä
572e34304e Merge 10.5 into 10.6 2022-03-14 10:59:46 +02:00
Marko Mäkelä
59359fb44a MDEV-24841 Build error with MSAN use-of-uninitialized-value in comp_err
The MemorySanitizer implementation in clang includes some built-in
instrumentation (interceptors) for GNU libc. In GNU libc 2.33, the
interface to the stat() family of functions was changed. Until the
MemorySanitizer interceptors are adjusted, any MSAN code builds
will act as if that the stat() family of functions failed to initialize
the struct stat.

A fix was applied in
https://reviews.llvm.org/rG4e1a6c07052b466a2a1cd0c3ff150e4e89a6d87a
but it fails to cover the 64-bit variants of the calls.

For now, let us work around the MemorySanitizer bug by defining
and using the macro MSAN_STAT_WORKAROUND().
2022-03-14 09:28:55 +02:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Sergei Golubchik
bc10f58a58 MDEV-24909 JSON functions don't respect KILL QUERY / max_statement_time limit
pass the pointer to thd->killed down to the json library,
check it while scanning,
use thd->check_killed() to generate the proper error message
2022-01-30 12:07:31 +01:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Oleksandr Byelkin
41a163ac5c Merge branch '10.2' into 10.3 2022-01-29 15:41:05 +01:00
Alexander Barkov
b915f79e4e MDEV-25904 New collation functions to compare InnoDB style trimmed NO PAD strings 2022-01-21 12:16:07 +04:00
Alexander Barkov
47463e5796 MDEV-27552 Change the return type of my_uca_context_weight_find() to MY_CONTRACTION* 2022-01-20 15:44:13 +04:00
Vladislav Vaintroub
47e18af906 MDEV-27494 Rename .ic files to .inl 2022-01-17 16:41:51 +01:00
Marko Mäkelä
25ac047baf Merge 10.5 into 10.6 2021-11-09 09:11:50 +02:00
Marko Mäkelä
9c18b96603 Merge 10.4 into 10.5 2021-11-09 08:50:33 +02:00
Marko Mäkelä
47ab793d71 Merge 10.3 into 10.4 2021-11-09 08:40:14 +02:00
Marko Mäkelä
524b4a89da Merge 10.2 into 10.3 2021-11-09 08:26:59 +02:00
Alexander Barkov
d0b611a76d MDEV-24335 Unexpected question mark in the end of a TINYTEXT column
my_copy_fix_mb() passed MIN(src_length,dst_length) to
my_append_fix_badly_formed_tail(). It could break a multi-byte
character in the middle, which put the question mark to the
destination.

Fixing the code to pass the true src_length to
my_append_fix_badly_formed_tail().
2021-11-02 09:00:49 +04:00
Alexander Barkov
059797ed44 MDEV-24901 SIGSEGV in fts_get_table_name, SIGSEGV in ib_vector_size, SIGSEGV in row_merge_fts_doc_tokenize, stack smashing
strmake() puts one extra 0x00 byte at the end of the string.
The code in my_strnxfrm_tis620[_nopad] did not take this into
account, so in the reported scenario the 0x00 byte was put outside
of a stack variable, which made ASAN crash.

This problem is already fixed in in MySQL:

  commit 19bd66fe43c41f0bde5f36bc6b455a46693069fb
  Author: bin.x.su@oracle.com <>
  Date:   Fri Apr 4 11:35:27 2014 +0800

But the fix does not seem to be correct, as it breaks when finds a zero byte
in the source string.

Using memcpy() instead of strmake().

- Unlike strmake(), memcpy() it does not write beyond the destination
  size passed.
- Unlike the MySQL fix, memcpy() does not break on the first 0x00 byte found
  in the source string.
2021-10-29 12:37:29 +04:00
Alexander Barkov
0d68b0a2d6 MDEV-26669 Add MY_COLLATION_HANDLER functions min_str() and max_str() 2021-09-27 17:10:22 +04:00
Vladislav Vaintroub
3d6eb7afcf MDEV-25602 get rid of __WIN__ in favor of standard _WIN32
This fixed the MySQL bug# 20338 about misuse of double underscore
prefix __WIN__, which was old MySQL's idea of identifying Windows
Replace it by _WIN32 standard symbol for targeting Windows OS
(both 32 and 64 bit)

Not that connect storage engine is not fixed in this patch (must be
fixed in "upstream" branch)
2021-06-06 13:21:03 +02:00
Monty
a206658b98 Change CHARSET_INFO character set and collaction names to LEX_CSTRING
This change removed 68 explict strlen() calls from the code.

The following renames was done to ensure we don't use the old names
when merging code from earlier releases, as using the new variables
for print function could result in crashes:
- charset->csname renamed to charset->cs_name
- charset->name renamed to charset->coll_name

Almost everything where mechanical changes except:
- Changed to use the new Protocol::store(LEX_CSTRING..) when possible
- Changed to use field->store(LEX_CSTRING*, CHARSET_INFO*) when possible
- Changed to use String->append(LEX_CSTRING&) when possible

Other things:
- There where compiler issues with ensuring that all character set names
  points to the same string: gcc doesn't allow one to use integer constants
  when defining global structures (constant char * pointers works fine).
  To get around this, I declared defines for each character set name
  length.
2021-05-19 22:54:07 +02:00
Monty
5c7d243b29 Add support for minimum field width for strings to my_vsnprintf()
This patch adds support for right aligned strings and numbers.
Left alignment is left as an exercise for anyone needing it.

MDEV-25612 "Assertion `to <= end' failed in process_args" fixed.
(Was caused by the original version of this patch)
2021-05-19 22:27:29 +02:00
Monty
fa7d4abf16 Added typedef decimal_digits_t (uint16) for number of digits in most
aspects of decimals and integers

For fields and Item's uint8 should be good enough. After
discussions with Alexander Barkov we choose uint16 (for now)
as some format functions may accept +256 digits.

The reason for this patch was to make the usage and storage of decimal
digits simlar. Before this patch decimals was stored/used as uint8,
int and uint.  The lengths for numbers where also using a lot of
different types.

Changed most decimal variables and functions to use the new typedef.

squash! af7f09106b6c1dc20ae8c480bff6fd22d266b184

Use decimal_digits_t for all aspects of digits (total, precision
and scale), both for decimals and integers.
2021-05-19 22:27:27 +02:00
Rucha Deodhar
2fdb556e04 MDEV-8334: Rename utf8 to utf8mb3
This patch changes the main name of 3 byte character set from utf8 to
utf8mb3. New old_mode UTF8_IS_UTF8MB3 is added and set TRUE by default,
so that utf8 would mean utf8mb3. If not set, utf8 would mean utf8mb4.
2021-05-19 06:48:36 +02:00
Monty
4d53a7585c Updated tests in decimal.c that match current API and usage 2021-05-11 21:25:08 +03:00
Marko Mäkelä
80ed136e6d Merge 10.4 into 10.5 2021-04-21 09:01:01 +03:00
Monty
031f11717d Fix all warnings given by UBSAN
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.

The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
  complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
  memory access of integers.  Fixed by using byte_order_generic.h when
  compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
  disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
  suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
  safe to have overflows (two cases, in item_func.cc).

Things fixed:
- Don't left shift signed values
  (byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
  constructors.  This was needed as UBSAN checks that these types has
  correct values when one copies an object.
  (gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
  deleted objects.
  (events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
  on Query_arena object.
- Fixed several cast of objects to an incompatible class!
  (Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
   sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
  This includes also ++ and -- of integers.
  (Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
  value_type is initialized to this instead of to -1, which is not a valid
  enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
  instead of a null string (safer as it ensures we do not do arithmetic
  on null strings).

Other things:

- Changed struct st_position to an OBJECT and added an initialization
  function to it to ensure that we do not copy or use uninitialized
  members. The change to a class was also motived that we used "struct
  st_position" and POSITION randomly trough the code which was
  confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
  the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
  avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr.  (This variable was before
  only in 10.5 and up).  It can now have one of two values:
  ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
  it virtual. This was an effort to get UBSAN to work with loaded storage
  engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
  in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
  server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
  to integer arithmetic.

Changes that should not be needed but had to be done to suppress warnings
from UBSAN:

- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
  compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
  some compile time warnings.

Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
2021-04-20 12:30:09 +03:00
Sergei Golubchik
f33e57a9e6 Merge branch '10.4' into 10.5 2021-02-23 13:06:22 +01:00
Sergei Golubchik
e841957416 Merge branch '10.3' into 10.4 2021-02-23 09:25:57 +01:00
Sergei Golubchik
0ab1e3914c Merge branch '10.2' into 10.3 2021-02-22 22:42:27 +01:00
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Alexander Barkov
afc5bac49d MDEV-24790 CAST('0e1111111111' AS DECIMAL(38,0)) returns a wrong result 2021-02-08 16:19:45 +04:00
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
Daniel Black
f2fea295b4 ucs2: cppcheck - add va_end 2021-01-21 16:46:59 +11:00
Marko Mäkelä
133b4b46fe Merge 10.4 into 10.5 2020-11-03 16:24:47 +02:00
Marko Mäkelä
533a13af06 Merge 10.3 into 10.4 2020-11-03 14:49:17 +02:00