1726 commits

Author SHA1 Message Date
Marko Mäkelä
95d51369c9 Merge 10.10 into 10.11 2023-02-28 10:52:42 +02:00
Marko Mäkelä
f14d9fa09a Merge 10.9 into 10.10 2023-02-28 10:43:29 +02:00
Marko Mäkelä
c3246e4bf0 Merge 10.8 into 10.9 2023-02-28 10:37:11 +02:00
Alexander Barkov
b62123e0d5 MDEV-30716 Wrong casefolding in xxx_unicode_520_ci for U+0700..U+07FF
The array my_unicase_pages_unicode520[7] erroneously mapped to plane06
instead of plane07.
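
For illustration, a minimal sketch of why one wrong slot in a page-indexed case table affects a whole 256-code-point range. The types and the function below (demo_case_char, demo_case_info, demo_toupper) are hypothetical simplifications and are not the server's actual definitions; the sketch only assumes that slot N of the page array is selected by the high bits of the code point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-character case entry. */
typedef struct
{
  uint32_t toupper;
  uint32_t tolower;
} demo_case_char;

/* Hypothetical page table: slot N covers U+NN00..U+NNFF, NULL means "no case". */
typedef struct
{
  uint32_t maxchar;
  const demo_case_char *const *pages;
} demo_case_info;

/* Look up the upper-case form of wc through the page slot selected by wc >> 8. */
static uint32_t demo_toupper(const demo_case_info *info, uint32_t wc)
{
  const demo_case_char *page;
  assert(info != NULL && info->pages != NULL);
  if (wc > info->maxchar)
    return wc;                       /* outside the table: unchanged */
  page= info->pages[wc >> 8];
  return page ? page[wc & 0xFF].toupper : wc;
}
```

If slot 7 points at the data for page 6, every lookup in U+0700..U+07FF silently returns the mapping of the code point 0x100 lower, which is the kind of symptom described in the message above.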
2023-02-23 23:40:45 +04:00
Helmut Grohne
6f6fa3bec2 MDEV-30694: Cross building on x86_64 to arch i686 fails
Currently cross compilation on x86_64 to arch i686 fails
with error:

> ctype-uca1400data.h
/bin/sh: 1: uca-dump: not found

This commit makes sure that uca-dump is treated correctly
when cross compiling MariaDB to another architecture.
2023-02-22 16:01:46 +00:00
Alexander Barkov
33f8f92b74 MDEV-30695 Refactor case folding data types in Asian collations
This is a non-functional change and should not change the server behavior.

Casefolding information is now stored in items of a new data type MY_CASEFOLD_CHARACTER:

typedef struct casefold_info_char_t
{
  uint32 toupper;
  uint32 tolower;
} MY_CASEFOLD_CHARACTER;

Before this change, casefolding tables for Asian collations were stored in:

typedef struct unicase_info_char_st
{
  uint32 toupper;
  uint32 tolower;
  uint32 sort;
} MY_UNICASE_CHARACTER;

The "sort" member was not used in the code handling Asian collations,
it only wasted space.
(it's only used by Unicode _general_ci and _general_mysql500_ci collations).

Unicode collations (at least UCA and _bin) should also be refactored later,
but under terms of a separate task.
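
As a rough illustration of the space argument, the sketch below compares the two layouts from the message above. The demo_ names are hypothetical stand-ins that use uint32_t from <stdint.h> instead of the server's uint32; the concrete sizes (8 and 12 bytes) are an assumption that holds on common ABIs where a 32-bit integer takes 4 bytes and the structs carry no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two-field layout, mirroring MY_CASEFOLD_CHARACTER. */
typedef struct
{
  uint32_t toupper;
  uint32_t tolower;
} demo_casefold_char;

/* Three-field layout, mirroring MY_UNICASE_CHARACTER. */
typedef struct
{
  uint32_t toupper;
  uint32_t tolower;
  uint32_t sort;
} demo_unicase_char;

/* Bytes saved for a table of nchars entries once the unused field is gone. */
static size_t demo_bytes_saved(size_t nchars)
{
  assert(nchars <= SIZE_MAX / sizeof(demo_unicase_char));
  return nchars * (sizeof(demo_unicase_char) - sizeof(demo_casefold_char));
}
```

Under that assumption each entry shrinks by one third, so a single 256-entry page would save 1024 bytes.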
2023-02-21 14:10:25 +04:00
Alexander Barkov
7e341cc740 MDEV-30692 conf_to_src is not up to date
Fixing conf_to_src.c according to changes made by
 a206658b98

Also re-generating ctype-extra.c at the same time, so that its indentation
changes from manually edited to automatically generated.
2023-02-21 11:07:25 +04:00
Alexander Barkov
7f6b648d7d MDEV-30661 UPPER() returns an empty string for U+0251 in uca1400 collations for utf8
String length growth during upper/lower conversion
in Unicode collations depends only on the underlying MY_UNICASE_INFO
used in the collation.

Maintaining separate members CHARSET_INFO::caseup_multiply and
CHARSET_INFO::casedn_multiply duplicated this information
and caused bugs like this (when MY_UNICASE_INFO and case??_multiply
went out of sync because of incomplete CHARSET_INFO initialization).

Fix:

Changing CHARSET_INFO::caseup_multiply and CHARSET_INFO::casedn_multiply
from members to virtual functions.
The virtual functions in Unicode collations calculate case conversion
growth factors from the MY_UNICASE_INFO. This guarantees that the growth
factors are always in sync with the MY_UNICASE_INFO.
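
The idea of deriving the growth factor instead of storing it twice can be sketched as follows. Everything here is hypothetical: demo_case_info stands in for MY_UNICASE_INFO, demo_charset for CHARSET_INFO, and the growth fields are assumed to live in the case info; the real structures and function signatures may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the case info: the single source of truth. */
typedef struct
{
  unsigned caseup_growth;   /* max output bytes per input byte for UPPER() */
  unsigned casedn_growth;   /* max output bytes per input byte for LOWER() */
} demo_case_info;

typedef struct demo_charset demo_charset;

/* Hypothetical stand-in for the charset: growth is a function, not a field. */
struct demo_charset
{
  const demo_case_info *caseinfo;
  unsigned (*caseup_multiply)(const demo_charset *cs);
  unsigned (*casedn_multiply)(const demo_charset *cs);
};

static unsigned demo_caseup_multiply_unicode(const demo_charset *cs)
{
  assert(cs != NULL && cs->caseinfo != NULL);
  return cs->caseinfo->caseup_growth;
}

static unsigned demo_casedn_multiply_unicode(const demo_charset *cs)
{
  assert(cs != NULL && cs->caseinfo != NULL);
  return cs->caseinfo->casedn_growth;
}

/* Size a result buffer for UPPER() from the derived growth factor. */
static size_t demo_upper_buffer_size(const demo_charset *cs, size_t src_len)
{
  return src_len * cs->caseup_multiply(cs);
}
```

Because the factor is read through the case info on every call, swapping the case info of a charset cannot leave a stale multiplier behind.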
2023-02-17 17:33:27 +04:00
Marko Mäkelä
1fd0099839 Merge 10.10 into 10.11 2023-02-16 11:41:18 +02:00
Marko Mäkelä
345356b868 Merge 10.9 into 10.10 2023-02-16 11:36:38 +02:00
Marko Mäkelä
0d55914d96 Merge 10.8 into 10.9 2023-02-16 10:25:34 +02:00
Marko Mäkelä
dbab3e8d90 Merge 10.6 into 10.8 2023-02-10 13:43:53 +02:00
Marko Mäkelä
6aec87544c Merge 10.5 into 10.6 2023-02-10 13:03:01 +02:00
Marko Mäkelä
c41c79650a Merge 10.4 into 10.5 2023-02-10 12:02:11 +02:00
Alexander Barkov
0845bce0d9 MDEV-30556 UPPER() returns an empty string for U+0251 in Unicode-5.2.0+ collations for utf8 2023-02-03 18:18:32 +04:00
Oleksandr Byelkin
c7c415734d Merge branch '10.10' into 10.11 2023-01-31 11:07:08 +01:00
Oleksandr Byelkin
76bcea3154 Merge branch '10.9' into 10.10 2023-01-31 11:01:48 +01:00
Oleksandr Byelkin
de2d089942 Merge branch '10.8' into 10.9 2023-01-31 10:37:31 +01:00
Oleksandr Byelkin
638625278e Merge branch '10.7' into 10.8 2023-01-31 09:57:52 +01:00
Oleksandr Byelkin
b923b80cfd Merge branch '10.6' into 10.7 2023-01-31 09:33:58 +01:00
Oleksandr Byelkin
c3a5cf2b5b Merge branch '10.5' into 10.6 2023-01-31 09:31:42 +01:00
Oleksandr Byelkin
7fa02f5c0b Merge branch '10.4' into 10.5 2023-01-27 13:54:14 +01:00
Sergei Golubchik
0c27559994 MDEV-26817 runtime error: index 24320 out of bounds for type 'json_string_char_classes [128]' *and* ASAN: global-buffer-overflow on address ... READ of size 4 on SELECT JSON_VALID
protect from out-of-bound array access

it was already done in all other places, this one was the only one missed
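
A minimal sketch of this kind of guard, assuming a 128-entry class table indexed by a decoded character code. The table contents, the class names and demo_json_classify are made up for illustration and do not reproduce the server's JSON code.

```c
#include <stddef.h>

enum demo_char_class
{
  DEMO_C_OTHER= 0,
  DEMO_C_QUOTE,
  DEMO_C_BACKSLASH,
  DEMO_C_NON_ASCII
};

/* Hypothetical class table covering only the 128 ASCII codes. */
static const unsigned char demo_classes[128]=
{
  ['"']=  DEMO_C_QUOTE,
  ['\\']= DEMO_C_BACKSLASH
};

#define DEMO_NCLASSES (sizeof(demo_classes) / sizeof(demo_classes[0]))

/* Classify a decoded character; codes past the table never index into it. */
static enum demo_char_class demo_json_classify(unsigned long wc)
{
  if (wc >= DEMO_NCLASSES)
    return DEMO_C_NON_ASCII;
  return (enum demo_char_class) demo_classes[wc];
}
```

With the check in place, a wide character such as code 24320 is mapped to a catch-all class instead of reading past the end of the array.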
2023-01-20 19:43:15 +01:00
Marko Mäkelä
3a237f7666 Merge 10.10 into 10.11 2023-01-11 11:13:56 +02:00
Marko Mäkelä
cae5a0328b Merge 10.9 into 10.10 2023-01-10 15:06:25 +02:00
Alexander Freiherr von Buddenbrock
0225159a8d MDEV-29381: JSON paths containing dashes are reported as syntax errors in
procedures

MDEV-22224 caused the parsing of keys with hyphens to break by setting
the state transitions for parsing keys to JE_SYN (syntax error) when
they encounter a hyphen. However json key names may contain hyphens and
still be considered valid json.

This patch changes the state transition table so that key names with
hyphens remain valid. Note that unquoted key names in paths like
$.key-name are also valid again. This restores the previous behaviour
when hyphens were considered part of the P_ETC character class.
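
The effect of the character class on key parsing can be sketched with a toy transition table. This is not the server's path parser: the classes, states and demo_ functions are hypothetical, and the hyphen_is_etc flag exists only to show both behaviours side by side.

```c
#include <ctype.h>

enum demo_class { DC_ETC, DC_DOT, DC_EOS, DC_BAD, DC_COUNT };
enum demo_state { DS_KEY, DS_DONE, DS_ERR };

/* Transitions while scanning an unquoted key, indexed by character class. */
static const enum demo_state demo_key_transitions[DC_COUNT]=
{
  [DC_ETC]= DS_KEY,   /* ordinary key character: keep scanning */
  [DC_DOT]= DS_DONE,  /* next path step begins */
  [DC_EOS]= DS_DONE,  /* end of the path */
  [DC_BAD]= DS_ERR    /* syntax error */
};

/* Classify one character; the flag decides which class a hyphen falls into. */
static enum demo_class demo_path_class(char ch, int hyphen_is_etc)
{
  if (ch == '\0')
    return DC_EOS;
  if (ch == '.')
    return DC_DOT;
  if (ch == '-')
    return hyphen_is_etc ? DC_ETC : DC_BAD;
  if (isalnum((unsigned char) ch) || ch == '_')
    return DC_ETC;
  return DC_BAD;
}

/* Return the length of the key starting at s, or -1 on a syntax error. */
static int demo_scan_key(const char *s, int hyphen_is_etc)
{
  int len= 0;
  for (;;)
  {
    enum demo_state next=
      demo_key_transitions[demo_path_class(s[len], hyphen_is_etc)];
    if (next == DS_ERR)
      return -1;
    if (next == DS_DONE)
      return len;
    len++;
  }
}
```

In this toy model, scanning "key-name" succeeds when the hyphen belongs to the ordinary class and fails with a syntax error otherwise.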
2023-01-06 12:55:51 +05:30
Marko Mäkelä
0aca3012a1 Merge 10.10 into 10.11 2022-12-14 09:18:30 +02:00
Marko Mäkelä
fa389b9098 Merge 10.9 into 10.10 2022-12-14 08:57:39 +02:00
Marko Mäkelä
b7914f562d Merge 10.8 into 10.9 2022-12-13 18:24:51 +02:00
Marko Mäkelä
d7a4ce3c80 Merge 10.7 into 10.8 2022-12-13 18:11:24 +02:00
Marko Mäkelä
25b91c3f13 Merge 10.6 into 10.7 2022-12-13 18:01:49 +02:00
Marko Mäkelä
a8a5c8a1b8 Merge 10.5 into 10.6 2022-12-13 16:58:58 +02:00
Marko Mäkelä
1dc2f35598 Merge 10.4 into 10.5 2022-12-13 14:39:18 +02:00
Marko Mäkelä
fdf43b5c78 Merge 10.3 into 10.4 2022-12-13 11:37:33 +02:00
Marko Mäkelä
64071d30bd Merge 10.10 into 10.11 2022-12-07 10:00:52 +02:00
Marko Mäkelä
3ff4eb07ed Merge 10.9 into 10.10 2022-12-07 09:49:38 +02:00
Marko Mäkelä
23f705f3a2 Merge 10.8 into 10.9 2022-12-07 09:43:38 +02:00
Marko Mäkelä
b3c254339b Merge 10.7 into 10.8 2022-12-07 09:43:13 +02:00
Marko Mäkelä
9e27e53dfa Merge 10.6 into 10.7 2022-12-07 09:39:46 +02:00
Marko Mäkelä
e55397a46d Merge 10.5 into 10.6 2022-12-05 18:04:23 +02:00
Jan Lindström
4eb8e51c26 Merge 10.4 into 10.5 2022-11-30 13:10:52 +02:00
Alexander Barkov
931549ff66 MDEV-27670 Assertion `(cs->state & 0x20000) == 0' failed in my_strnncollsp_nchars_generic_8bit
Also fixes:

MDEV-27768 MDEV-25440: Assertion `(cs->state & 0x20000) == 0' failed in my_strnncollsp_nchars_generic_8bit

The "strnncollsp_nchars" virtual function pointer for tis620_thai_nopad_ci
was incorrectly initialized to a generic function
my_strnncollsp_nchars_generic_8bit(), which crashed on assert.

Implementing a tis620-specific version of the function.
2022-11-22 14:03:23 +04:00
Alexander Barkov
6216a2dfa2 MDEV-29473 UBSAN: Signed integer overflow: X * Y cannot be represented in type 'int' in strings/dtoa.c
Fixing a few problems revealed by UBSAN in type_float.test

- multiplication overflow in dtoa.c

- uninitialized Field::geom_type (and Field::srid as well)

- Wrong call-back function types used in combination with SHOW_FUNC.
  Changes in the mysql_show_var_func data type definition were not
  properly addressed all around the code by the following commits:
    b4ff64568c
    18feb62fee
    0ee879ff8a

  Adding a helper SHOW_FUNC_ENTRY() function and replacing
  all mysql_show_var_func declarations using SHOW_FUNC
  with SHOW_FUNC_ENTRY, to catch mysql_show_var_func type mismatches
  at compilation time in the future.
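
How a typed helper can catch a wrong callback signature is sketched below. The names (demo_show_var_func, demo_show_entry, demo_show_func_entry) and the callback signature are hypothetical and much simpler than the real SHOW_VAR machinery; the only assumption carried over is that the entry stores the callback in a type-erased slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback signature that every SHOW function must match. */
typedef int (*demo_show_var_func)(void *thd, long *out);

enum demo_show_type { DEMO_SHOW_LONG, DEMO_SHOW_FUNC };

/* Hypothetical entry: the callback is stored type-erased. */
typedef struct
{
  const char *name;
  void (*value)(void);
  enum demo_show_type type;
} demo_show_entry;

/* The only place that erases the type: the parameter forces a signature check. */
static demo_show_entry demo_show_func_entry(const char *name,
                                            demo_show_var_func func)
{
  demo_show_entry e;
  e.name= name;
  e.value= (void (*)(void)) func;
  e.type= DEMO_SHOW_FUNC;
  return e;
}

/* Restore the original type before calling. */
static int demo_show_call(const demo_show_entry *e, void *thd, long *out)
{
  assert(e != NULL && e->type == DEMO_SHOW_FUNC);
  return ((demo_show_var_func) e->value)(thd, out);
}

/* Example callback with the expected signature. */
static int demo_show_uptime(void *thd, long *out)
{
  (void) thd;
  *out= 42;
  return 0;
}
```

Passing a function with a different signature to demo_show_func_entry() is rejected by a C++ compiler and diagnosed by C compilers, whereas an explicit cast written at every declaration site would hide the mismatch.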
2022-11-17 17:51:01 +04:00
Jan Lindström
90608bd649 Merge 10.10 into 10.11 2022-09-06 11:32:54 +03:00
Alexander Barkov
f6118acda9 A follow-up patch for MDEV-27266 Improve UCA collation performance for utf8mb3 and utf8mb4
Moving these members:

   CHARSET_INFO *cs;
   const MY_UCA_WEIGHT_LEVEL *level;

from my_uca_scanner to a new separate structure my_uca_scanner_param.

Rationale:

During a comparison of two strings these members were initialized two times
(one time for every string).

After the change these members are initialized only one time inside
a shared instance of my_uca_scanner_param, and the instance is
shared between two scanners (its const address is passed as a new parameter
to the underlying scanner functions).

This change gives a slight performance improvement (~5%).
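
The shape of this refactoring can be sketched with a toy byte-weight comparison. None of the demo_ types below match the real my_uca_scanner or my_uca_scanner_param; the per-byte weight table is an assumed simplification standing in for the collation data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared, read-only part: set up once per comparison. */
typedef struct
{
  const unsigned char *weights;   /* 256 per-byte weights */
} demo_scanner_param;

/* Hypothetical per-string part: only the position state remains here. */
typedef struct
{
  const unsigned char *pos;
  const unsigned char *end;
} demo_scanner;

static void demo_scanner_init(demo_scanner *s,
                              const unsigned char *str, size_t len)
{
  s->pos= str;
  s->end= str + len;
}

/* Return the next weight, or -1 at the end of the string. */
static int demo_scanner_next(demo_scanner *s, const demo_scanner_param *param)
{
  if (s->pos >= s->end)
    return -1;
  return param->weights[*s->pos++];
}

/* Compare two strings; both scanners share one const parameter block. */
static int demo_compare(const demo_scanner_param *param,
                        const unsigned char *a, size_t alen,
                        const unsigned char *b, size_t blen)
{
  demo_scanner sa, sb;
  int wa, wb;
  assert(param != NULL && param->weights != NULL);
  demo_scanner_init(&sa, a, alen);
  demo_scanner_init(&sb, b, blen);
  do
  {
    wa= demo_scanner_next(&sa, param);
    wb= demo_scanner_next(&sb, param);
  } while (wa == wb && wa >= 0);
  return wa - wb;
}
```

Each scanner now initializes two pointers only, while the collation-level data is prepared once and passed by const address to both.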
2022-09-02 13:23:24 +04:00
Marko Mäkelä
fe1f8f2c6b Merge 10.10 into 10.11 2022-08-30 13:36:30 +03:00
Marko Mäkelä
e71aca8200 Merge 10.9 into 10.10 2022-08-30 13:33:02 +03:00
Marko Mäkelä
50d6966c50 Merge 10.8 into 10.9 2022-08-30 13:22:57 +03:00
Marko Mäkelä
c8cd162a0a Merge 10.7 into 10.8 2022-08-30 13:04:17 +03:00
Marko Mäkelä
b86be02ecf Merge 10.6 into 10.7 2022-08-30 13:02:42 +03:00