MDEV-20912 Add support for utf8mb4_0900_* collations in MariaDB Server

This is done by mapping most of the existing MySQL unicode 0900 collations
to MariadB 1400 unicode collations. The assumption is that 1400 is a super
set of 0900 for all practical purposes.

I also added a new function 'compare_collations()' and changed most code
to use this instead of comparing character sets directly.
This enables one to seamlessly mix-and-match the corresponding 0900 and
1400 sets. Field comparision and alter table treats the character sets
as identical.

All MySQL 8.0 0900 collations are supported except:
- utf8mb4_ja_0900_as_cs
- utf8mb4_ja_0900_as_cs_ks
- utf8mb4_ru_0900_as_cs
- utf8mb4_zh_0900_as_cs

These do not have corresponding entries in the MariadB 01400 collations.

Other things:
- Added COMMENT colum to information_schema.collations. For utf8mb4_0900
  colletions it contains the corresponding alias collation.
This commit is contained in:
Monty 2024-12-15 15:57:53 +02:00
commit 7fcaab7aaa
21 changed files with 6284 additions and 102 deletions

View file

@ -388,8 +388,8 @@ select "foo" = "foo " collate latin1_test;
"foo" = "foo " collate latin1_test
1
The following tests check that two-byte collation IDs work
select * from information_schema.collations where id>256 and is_compiled<>'Yes' order by id;
COLLATION_NAME CHARACTER_SET_NAME ID IS_DEFAULT IS_COMPILED SORTLEN
select collation_name, character_set_name, id, is_default, is_compiled, sortlen from information_schema.collations where id>256 and is_compiled<>'Yes' order by id;
collation_name character_set_name id is_default is_compiled sortlen
ascii2_general_nopad_ci ascii2 318 1
ascii2_bin2 ascii2 319 1
ascii2_general_ci ascii2 320 Yes 1

View file

@ -171,7 +171,7 @@ select "foo" = "foo " collate latin1_test;
# The file ../std-data/Index.xml has a number of collations with high IDs.
# Test that the "ID" column in I_S and SHOW queries can handle two bytes
select * from information_schema.collations where id>256 and is_compiled<>'Yes' order by id;
select collation_name, character_set_name, id, is_default, is_compiled, sortlen from information_schema.collations where id>256 and is_compiled<>'Yes' order by id;
show collation like '%test%';
# Test that two-byte collation ID is correctly transfered to the client side.