Commit graph

9 commits

Author SHA1 Message Date
Alexander Barkov
36eba98817 MDEV-19123 Change default charset from latin1 to utf8mb4
Changing the default server character set from latin1 to utf8mb4.
2024-07-11 10:21:07 +04:00
Marko Mäkelä
7c7ac6d4a4 Merge 10.6 into 10.7
2022-09-21 09:33:07 +03:00
Vladislav Vaintroub
a6cf8b34a8 MDEV-26806 Server crash in Charset::charset / Item_func_natural_sort_key::val_str
The reason for the crash is that natural_sort_key(release_lock('a')) would
evaluate release_lock() twice, once in Item::is_null() and again
in Item::val_str(). The second time it returns NULL, since the lock was
already released.

Fixed to prevent double evaluation.
2021-10-14 12:13:05 +02:00
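
The double-evaluation problem described in MDEV-26806 can be illustrated outside the server. The following is a minimal Python sketch, not MariaDB code: `ReleaseLock`, `key_double_eval` and `key_single_eval` are hypothetical names standing in for an argument with a side effect and for the buggy and fixed call patterns.

```python
class ReleaseLock:
    """Hypothetical stand-in for an argument with a side effect:
    the first evaluation succeeds, every later one yields None."""

    def __init__(self):
        self.held = True

    def evaluate(self):
        if self.held:
            self.held = False
            return "1"
        return None


def key_double_eval(arg):
    # Buggy pattern: one evaluation for the NULL check ...
    if arg.evaluate() is None:
        return None
    # ... and a second one for the value, which is now None.
    value = arg.evaluate()
    return "key:" + value  # raises TypeError because value is None


def key_single_eval(arg):
    # Fixed pattern: evaluate once, then test the cached result.
    value = arg.evaluate()
    if value is None:
        return None
    return "key:" + value
```

Calling `key_single_eval(ReleaseLock())` returns a key, while `key_double_eval(ReleaseLock())` fails on the second evaluation, which mirrors the reported crash only in spirit.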
Vladislav Vaintroub
bc09362eb3 MDEV-26796 Natural sort does not work for utf32/utf16/ucs2
Fixed typo, added test.
2021-10-14 12:13:05 +02:00
Vladislav Vaintroub
5b5a67b2a9 MDEV-26786 Inserting NULL into base column breaks NATURAL_SORT_KEY column
When returning a non-null value from natural_sort_key, make sure
Item::null_value is false.
2021-10-14 12:13:05 +02:00
Vladislav Vaintroub
6c5c1fd55a MDEV-4742 - remove leading zero handling, and cleanups.
Leading zeros added a single byte of overhead per numeric string,
even when there were none. Sorting by leading zeros offers only little value
(except determinism in the sort). I decided to drop it for now; we can be
like ICU, which drops leading zeros in numeric sorting,
even with IDENTICAL collation strength.

Also, disabled virtual stored columns (thus also indexes), on Serg's request.
Hopefully this is temporary, and they will be re-enabled soon, when everyone is
as happy with the key generation algorithm as I am.
2021-10-14 12:13:05 +02:00
Vladislav Vaintroub
167d250924 MDEV-4742 additions
- Return an error from natsort_encode_numeric_key if it would need
to allocate memory. All needed memory was preallocated much earlier.

- Add test for sort order of leading zero vs numeric strings with suffix.
2021-10-14 12:13:04 +02:00
Vladislav Vaintroub
b3cedf63a3 MDEV-4742 - address review comments.
- Remove second optional parameter to natural_sort_key(), and all fraction
handling.

- Rename natsort_num2str() to natsort_encode_length() to make clear
that it encodes string *lengths*, not whitespace or anything else.

Handle lengths for which log10(len) >= 10, even though they do not occur for
MariaDB Strings (where the length is limited to 32 bits, so log10(len) <= 9).

- Do not let the natural sort key grow past max_packet_length.

- Split Item_func_natural_sort_key::val_str() further and add
natsort_encode_numeric_string(), which contains a comment on how
whitespace is handled.

- Simplify and speed up to_natsort_key() in the common case by removing the
handling of weird charsets utf16/32, which encode digits in several bytes.
In the rare cases where utf16/32 is used, we convert to utf8 before
creating keys, and back to the original charset afterwards.
2021-10-14 12:13:04 +02:00
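
To illustrate what "encoding lengths" means here, the following Python sketch shows one way to encode a non-negative length so that plain string comparison of the encodings agrees with numeric comparison of the lengths, for any number of digits. The scheme and the name `encode_length_sketch` are assumptions made for illustration; this is not claimed to be what natsort_encode_length() does.

```python
def encode_length_sketch(n: int) -> str:
    """Hypothetical order-preserving encoding of a non-negative length.

    A run of '9' characters (one per digit beyond the first) followed by
    a '0' terminator announces how many digits follow, so an encoding of
    a number with more digits always compares greater as a string.
    """
    if n < 0:
        raise ValueError("length must be non-negative")
    digits = str(n)
    return "9" * (len(digits) - 1) + "0" + digits


# Usage: sorting the encoded strings reproduces the numeric order.
lengths = [100, 9, 10, 0, 12345678901, 99]
encoded = sorted(encode_length_sketch(n) for n in lengths)
assert encoded == [encode_length_sketch(n) for n in sorted(lengths)]
```

The prefix makes the encoding self-delimiting, which is why it keeps working for lengths with ten or more digits, at the cost of one extra byte per digit.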
Vladislav Vaintroub
5b29d407f6 MDEV-4742 - provide function to sort string in "natural" order
Numbers should be compared as numbers, while the rest should be compared
as strings.

Introduce the natural_sort_key() function, which transforms the original string
so that the lexicographic order of such keys is suitable for
natural sort.
2021-10-14 12:13:04 +02:00
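
The idea of a key whose lexicographic order gives natural order can be sketched in a few lines of Python. This is an illustrative approximation, not the server's algorithm: the fixed four-character length prefix (which assumes digit runs shorter than 10000 digits), the dropping of leading zeros, and the restriction to ASCII digits are all assumptions of the sketch.

```python
import re


def natural_sort_key(s: str) -> str:
    """Sketch: rewrite each ASCII digit run as <length><digits> so that
    comparing keys as plain strings compares the numbers numerically."""
    parts = []
    for run in re.findall(r"[0-9]+|[^0-9]+", s):
        if "0" <= run[0] <= "9":
            # Drop leading zeros, keeping a single "0" for an all-zero run.
            digits = run.lstrip("0") or "0"
            # Assumed fixed-width length prefix: longer numbers sort later.
            parts.append(f"{len(digits):04d}{digits}")
        else:
            parts.append(run)
    return "".join(parts)


names = ["file10", "file2", "file1"]
print(sorted(names, key=natural_sort_key))  # ['file1', 'file2', 'file10']
```

Because the key is an ordinary string, it can be compared or sorted with any standard string ordering, which is the property the commit message describes.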