Commit graph

16 commits

Author SHA1 Message Date
Oleksandr Byelkin
4fb2cb1a30 Merge branch '10.7' into 10.8 2022-02-04 14:50:25 +01:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Alexander Barkov
b915f79e4e MDEV-25904 New collation functions to compare InnoDB style trimmed NO PAD strings 2022-01-21 12:16:07 +04:00
Alexander Barkov
47463e5796 MDEV-27552 Change the return type of my_uca_context_weight_find() to MY_CONTRACTION* 2022-01-20 15:44:13 +04:00
Marko Mäkelä
4f7574b10c MDEV-27042 fixup: GCC 11 -Og -Wmaybe-uninitialized 2021-11-29 09:24:58 +02:00
Alexander Barkov
f9ad8072cd MDEV-27042 UCA: Resetting contractions to ignorable does not work well
The weight scanner routine scanner_next() did not properly handle the cases
when a contraction produces no weights (is ignorable).

Adding a helper routine my_uca_scanner_set_weight() and using
it in all cases:

- A single ASCII character
- A contraction starting with an ASCII character
- A multi-byte character
- A contraction starting with a multi-byte character

Also adding two other helper routines:

- my_uca_scanner_next_expansion_weight()
- my_uca_scanner_set_weight_outside_maxchar()

to avoid using scanner->wbeg directly inside scanner_next().
This reduces the probability of similar future bugs.
2021-11-24 13:45:35 +04:00
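
The failure mode described in this commit can be illustrated with a toy scanner. This is a minimal sketch under stated assumptions, not the code in the MariaDB source: every name here (toy_scanner, toy_lookup, toy_scanner_next, toy_collect) and the weight table are hypothetical, and the contraction "zz" stands in for a contraction that was reset to ignorable. The point it shows is that a unit with an empty weight string must make the scanner move on to the next unit instead of returning a zero weight.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy scanner; not MariaDB's my_uca_scanner. */
typedef struct {
  const uint16_t *wbeg;   /* remaining weights of the current unit, 0-terminated */
  const char *s, *end;    /* unread input */
  uint16_t one[2];        /* scratch buffer for single-character weights */
} toy_scanner;

static const uint16_t toy_nochar[] = {0};
static const uint16_t toy_w_ch[] = {0x0100, 0x0101, 0}; /* "ch": two weights */
static const uint16_t toy_w_ignorable[] = {0};          /* "zz": no weights */

static void toy_scanner_init(toy_scanner *sc, const char *str, size_t len)
{
  sc->wbeg = toy_nochar;
  sc->s = str;
  sc->end = str + len;
}

/* Find the weights of the unit at sc->s; return the bytes it occupies.
   The caller guarantees at least one unread byte. */
static size_t toy_lookup(toy_scanner *sc, const uint16_t **w)
{
  size_t left = (size_t) (sc->end - sc->s);
  if (left >= 2 && sc->s[0] == 'c' && sc->s[1] == 'h')
  {
    *w = toy_w_ch;
    return 2;
  }
  if (left >= 2 && sc->s[0] == 'z' && sc->s[1] == 'z')
  {
    *w = toy_w_ignorable;
    return 2;
  }
  sc->one[0] = (uint16_t) (unsigned char) sc->s[0];
  sc->one[1] = 0;
  *w = sc->one;
  return 1;
}

/* Return the next weight, or -1 at end of input. */
static int toy_scanner_next(toy_scanner *sc)
{
  for (;;)
  {
    const uint16_t *w;
    if (*sc->wbeg)
      return *sc->wbeg++;
    if (sc->s >= sc->end)
      return -1;
    sc->s += toy_lookup(sc, &w);
    sc->wbeg = w;  /* may be empty: loop again rather than return 0 */
  }
}

/* Collect up to max weights of a 0-terminated string; return the count. */
static size_t toy_collect(const char *str, int *out, size_t max)
{
  toy_scanner sc;
  size_t n = 0;
  int w;
  toy_scanner_init(&sc, str, strlen(str));
  while (n < max && (w = toy_scanner_next(&sc)) >= 0)
    out[n++] = w;
  return n;
}
```

In this sketch every kind of unit goes through the same assignment to wbeg, so an ignorable contraction is skipped by the same loop that handles ordinary characters. That mirrors the motivation given above for routing all cases through one helper, without claiming to reproduce it.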
Alexander Barkov
0a3d1d106a Refactoring for MDEV-27042 and MDEV-27009
This patch prepares the code for upcoming changes:

MDEV-27009 Add UCA-14.0.0 collations
MDEV-27042 UCA: Resetting contractions to ignorable does not work well

1. Adding "const" qualifiers to return type and parameters in functions:
   - my_uca_contraction2_weight()
   - my_wmemcmp()
   - my_uca_contraction_weight()
   - my_uca_scanner_contraction_find()
   - my_uca_previous_context_find()
   - my_uca_context_weight_find()

2. Adding a helper function my_uca_true_contraction_eq()

3. Changing how scanner->wbeg is set during context weight handling.
   It was previously set inside the functions:
   - my_uca_scanner_contraction_find()
   - my_uca_previous_context_find()
   Now it is set inside scanner_next(), which makes the code more symmetric
   for context-free and context-dependent sequences.
   This makes the upcoming fix for MDEV-27042 simpler.
2021-11-24 13:35:57 +04:00
Alexander Barkov
0d68b0a2d6 MDEV-26669 Add MY_COLLATION_HANDLER functions min_str() and max_str() 2021-09-27 17:10:22 +04:00
Alexander Barkov
f1e13fdc8d MDEV-21581 Helper functions and methods for CHARSET_INFO 2020-01-28 12:29:23 +04:00
Alexander Barkov
3e7e87ddcc MDEV-19897 Rename source code variable names from utf8 to utf8mb3 2019-06-28 12:37:04 +04:00
Alexander Barkov
4272eec050 MDEV-17534 Implement fast path for ASCII range in strnxfrm_onelevel_internal() 2018-10-24 15:12:38 +04:00
Alexander Barkov
88cfde26e8 A cleanup for MDEV-17511. Re-enabling ctype_ldml.test. 2018-10-21 21:28:11 +04:00
Alexander Barkov
3fe2b86627 MDEV-17511 Improve performance for ORDER BY with a CHAR(N) CHARACTER SET utf8_unicode_ci 2018-10-21 05:02:38 +04:00
Alexander Barkov
475c6ec551 MDEV-17474 Change Unicode collation implementation from "handler" to "inline" style (part#2)
Additional changes:

1. Adding a fast path for ASCII characters
2. Adding dedicated MY_COLLATION_HANDLERs for collations with no contractions
   (for utf8 and for utf8mb4 character sets). The choice between
   the full-featured handler and the "no contraction" handler is
   made at the collation initialization time.
2018-10-18 07:49:58 +04:00
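
Choosing a handler once, at collation initialization time, can be sketched with a table of function pointers. This is an illustrative sketch only: toy_collation, toy_handler, toy_collation_init and the weight rules are hypothetical and much simpler than MY_COLLATION_HANDLER, and the single contraction "ch" is an assumed example. It shows a full-featured comparison that checks for contractions next to a cheaper one that never does, with the choice made a single time from the collation's contraction count.

```c
#include <stddef.h>

/* Hypothetical handler table; far smaller than MY_COLLATION_HANDLER. */
typedef struct {
  const char *name;
  int (*strnncoll)(const char *a, size_t alen, const char *b, size_t blen);
} toy_handler;

typedef struct {
  size_t ncontractions;        /* how many contractions the collation defines */
  const toy_handler *handler;  /* chosen once by toy_collation_init() */
} toy_collation;

/* Next weight, or -1 at end. The toy contraction "ch" sorts right after 'h'. */
static int toy_next_weight(const char **s, const char *end, int with_contractions)
{
  if (*s >= end)
    return -1;
  if (with_contractions && end - *s >= 2 && (*s)[0] == 'c' && (*s)[1] == 'h')
  {
    *s += 2;
    return 'h' * 2 + 1;
  }
  return (unsigned char) *(*s)++ * 2;
}

static int toy_compare(const char *a, size_t alen,
                       const char *b, size_t blen, int with_contractions)
{
  const char *ae = a + alen, *be = b + blen;
  for (;;)
  {
    int wa = toy_next_weight(&a, ae, with_contractions);
    int wb = toy_next_weight(&b, be, with_contractions);
    if (wa != wb)
      return wa < wb ? -1 : 1;
    if (wa < 0)
      return 0;  /* both strings ended with all weights equal */
  }
}

/* Two thin wrappers, so each handler carries a constant flag. */
static int toy_strnncoll_full(const char *a, size_t alen,
                              const char *b, size_t blen)
{
  return toy_compare(a, alen, b, blen, 1);
}

static int toy_strnncoll_nocontr(const char *a, size_t alen,
                                 const char *b, size_t blen)
{
  return toy_compare(a, alen, b, blen, 0);
}

static const toy_handler toy_handler_full = {"full", toy_strnncoll_full};
static const toy_handler toy_handler_nocontr = {"nocontr", toy_strnncoll_nocontr};

/* The choice is made once here, not on every comparison. */
static void toy_collation_init(toy_collation *cs)
{
  cs->handler = cs->ncontractions ? &toy_handler_full : &toy_handler_nocontr;
}
```

With this layout a caller only ever invokes cs->handler->strnncoll(...). A collation without contractions never executes the contraction check, which is the kind of saving the commit message attributes to the dedicated "no contraction" handlers.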
Alexander Barkov
6eae037c4c MDEV-17474 Change Unicode collation implementation from "handler" to "inline" style 2018-10-17 06:44:40 +04:00