mariadb/mysys
Alexander Barkov 1748b02ec6 MDEV-36213 Doubled memory usage (11.4.4 <-> 11.4.5)
Fixing the code adding MySQL _0900_ collations as _uca1400_ aliases
not to perform deep initialization of the corresponding _uca1400_
collations.

Only basic initialization is now performed, which is enough to list
these collations (both _0900_ and _uca1400_) in queries to
INFORMATION_SCHEMA tables COLLATIONS and
COLLATION_CHARACTER_SET_APPLICABILITY,
as well as in SHOW COLLATION statements.

Deep initialization is now performed only when a collation
(either the _0900_ alias or the corresponding _uca1400_ collation)
is used for the very first time after the server startup.
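
The split between basic and deep initialization described above can be
sketched roughly as follows. This is a minimal illustration only, not the
server code: demo_collation, demo_basic_init(), demo_deep_init() and
demo_get_ready() are hypothetical names, the weights array merely stands in
for the expensive UCA data, and the sketch makes no attempt at thread safety.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a collation descriptor (not the real CHARSET_INFO) */
typedef struct demo_collation
{
  const char *name;   /* set by basic initialization */
  int deep_ready;     /* non-zero once the expensive data is built */
  int weights[4];     /* stands in for the large weight tables */
} demo_collation;

static int deep_init_calls= 0;  /* counts expensive initializations */

/* Basic initialization: cheap, enough to list the collation by name */
static void demo_basic_init(demo_collation *cs, const char *name)
{
  cs->name= name;
  cs->deep_ready= 0;
}

/* Deep initialization: expensive, deferred until the first real use */
static void demo_deep_init(demo_collation *cs)
{
  size_t i;
  for (i= 0; i < 4; i++)
    cs->weights[i]= (int) (i + 1) * 100;
  cs->deep_ready= 1;
  deep_init_calls++;
}

/* Return a usable collation, running deep initialization at most once */
static demo_collation *demo_get_ready(demo_collation *cs)
{
  assert(cs != NULL && cs->name != NULL);
  if (!cs->deep_ready)
    demo_deep_init(cs);
  return cs;
}
```

In this sketch a collation that is only listed never pays for
demo_deep_init(); the cost is incurred on the first demo_get_ready() call.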

Refactoring was done to make the code easier to maintain:
- most of the _uca1400_ code was moved from ctype-uca.c
  to a new file ctype-uca1400.c
- most of the _0900_ code was moved from ctype-uca.c
  to a new file ctype-uca0900.c

Change details:

- The original function add_alias_for_collation() added by the patch for
  "MDEV-20912 Add support for utf8mb4_0900_* collations in MariaDB Server"
  was removed from mysys/charset.c, as it had two problems:

  a. it forced deep initialization of the _uca1400_ collations
     when adding _0900_ aliases for them at the server startup
     (the main reported problem)

  b. it introduced a cyclic dependency between /mysys and /strings
     - /mysys/charset-def.c depended on /strings/ctype-uca.c
     - /strings/ctype-uca.c depended on /mysys/charset.c

  The code from add_alias_for_collation() was split into separate functions.
  The cyclic dependency was removed. `#include <my_sys.h>` was removed
  from /strings/ctype-uca.c. Collations are now added using a callback
  function MY_CHARSET_LOADER::add_collation, like it is done for
  user collations defined in Index.xml. The code in /mysys sets
  MY_CHARSET_LOADER::add_collation to add_compiled_collation().

- The function compare_collations() was removed.
  A new virtual function was added into my_collation_handler_st instead:

    my_bool (*eq_collation)(CHARSET_INFO *self, CHARSET_INFO *other);

  because it is the collation handler that knows how to detect equal
  collations by comparing only some of CHARSET_INFO members without
  their deep initialization.

  Three implementations were added:
  - my_ci_eq_collation_uca() for UCA collations, it compares
    _0900_ collations as equal to their corresponding _uca1400_ collations.
  - my_ci_eq_collation_utf8mb4_bin(), it compares
    utf8mb4_nopad_bin and utf8mb4_0900_bin as equal.
  - my_ci_eq_collation_generic() - the default implementation,
    which compares all collations as not equal.

  A C++ wrapper CHARSET_INFO::eq_collations() was added.
  The code in /sql was changed to use the wrapper instead of
  the former calls to the removed function compare_collations().

- A part of add_alias_for_collation() was moved into a new function
  my_ci_alloc(). It allocates memory for a new charset_info_st
  instance together with the collation name and the comment using a single
  MY_CHARSET_LOADER::once_alloc call, which normally points to my_once_alloc().

- A part of add_alias_for_collation() was moved into a new function
  my_ci_make_comment_for_alias(). It makes an "Alias for xxx" string,
  e.g. "Alias for utf8mb4_uca1400_swedish_ai_ci" in case of
  utf8mb4_sv_0900_ai_ci.

- A part of the code in create_tailoring() was moved to
  a new function my_uca1400_collation_get_initialized_shared_uca(),
  to reuse the code between _uca1400_ and _0900_ collations.

- A new function my_collation_id_is_mysql_uca0900() was added
  in addition to my_collation_id_is_mysql_uca1400().

- Functions to build collation names were added:
   my_uca0900_collation_build_name()
   my_uca1400_collation_build_name()

- A shared function was added:

  my_bool
  my_uca1400_collation_alloc_and_init(MY_CHARSET_LOADER *loader,
                                      LEX_CSTRING name,
                                      LEX_CSTRING comment,
                                      const uca_collation_def_param_t *param,
                                      uint id)

  It's reused to add _uca1400_ and _0900_ collations, with basic
  initialization (without deep initialization).

- The function add_compiled_collation() changed its return type from
  void to int, to make it compatible with MY_CHARSET_LOADER::add_collation.

- Functions mysql_uca0900_collation_definition_add(),
  mysql_uca0900_utf8mb4_collation_definitions_add(),
  mysql_utf8mb4_0900_bin_add() were added into ctype-uca0900.c.
  They get MY_CHARSET_LOADER as a parameter.

- Functions my_uca1400_collation_definition_add(),
  my_uca1400_collation_definitions_add() were moved from
  charset-def.c to strings/ctype-uca1400.c.
  The latter now accepts MY_CHARSET_LOADER as the first parameter
  instead of initializing a MY_CHARSET_LOADER inside.

- init_compiled_charsets() now initializes a MY_CHARSET_LOADER
  variable and passes it to all functions adding collations:
  - mysql_utf8mb4_0900_collation_definitions_add()
  - mysql_uca0900_utf8mb4_collation_definitions_add()
  - mysql_utf8mb4_0900_bin_add()

- A new structure was added into ctype-uca.h:

  typedef struct uca_collation_def_param
  {
    my_cs_encoding_t cs_id;
    uint tailoring_id;
    uint nopad_flags;
    uint level_flags;
  } uca_collation_def_param_t;

  It simplifies reusing the code for _uca1400_ and _0900_ collations.

- The definition of MY_UCA1400_COLLATION_DEFINITION was
  moved from ctype-uca.c to ctype-uca1400.h, to reuse
  the code for _uca1400_ and _0900_ collations.

- The definitions of "MY_UCA_INFO my_uca_v1400" and
  "MY_UCA_INFO my_uca1400_info_tailored[][]" were moved from
  ctype-uca.c to ctype-uca1400.c.

- The definitions/declarations of:
  - mysql_0900_collation_start,
  - struct mysql_0900_to_mariadb_1400_mapping
  - mysql_0900_to_mariadb_1400_mapping
  - mysql_utf8mb4_0900_collation_definitions_add()
  were moved from ctype-uca.c to ctype-uca0900.c

- Functions
  my_uca1400_make_builtin_collation_id()
  my_uca1400_collation_definition_init()
  my_uca1400_collation_id_uca400_compat()
  my_ci_get_collation_name_uca1400_context()
  were moved from ctype-uca.c to ctype-uca1400.c and ctype-uca1400.h

- A part of my_uca1400_collation_definition_init()
  was moved into my_uca1400_collation_source(),
  to make functions smaller.
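
The eq_collation handler idea from the change details above can be sketched
roughly as follows. This is a reduced illustration under stated assumptions,
not the actual my_collation_handler_st: every demo_* name is hypothetical,
the compared fields only loosely mirror uca_collation_def_param_t, and the
pointer-identity shortcut in the wrapper is an assumption rather than
something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_cs demo_cs;

/* Hypothetical, reduced handler table with a single virtual function */
typedef struct demo_handler
{
  int (*eq_collation)(const demo_cs *self, const demo_cs *other);
} demo_handler;

struct demo_cs
{
  const char *name;
  unsigned tailoring_id;  /* cheap-to-compare definition parameters */
  unsigned nopad_flags;
  unsigned level_flags;
  const demo_handler *handler;
};

/* Default implementation: different collations never compare as equal */
static int demo_eq_generic(const demo_cs *self, const demo_cs *other)
{
  (void) self;
  (void) other;
  return 0;
}

/* UCA-style implementation: compare definition parameters, ignore names */
static int demo_eq_uca(const demo_cs *self, const demo_cs *other)
{
  return self->handler == other->handler &&
         self->tailoring_id == other->tailoring_id &&
         self->nopad_flags == other->nopad_flags &&
         self->level_flags == other->level_flags;
}

static const demo_handler demo_generic_handler= { demo_eq_generic };
static const demo_handler demo_uca_handler= { demo_eq_uca };

/* Wrapper; the identity shortcut (a == b) is an assumption of this sketch */
static int demo_eq_collations(const demo_cs *a, const demo_cs *b)
{
  assert(a != NULL && b != NULL);
  return a == b || a->handler->eq_collation(a, b);
}
```

Because only a few plain members are compared, an alias and its target can
be recognized as the same collation without building any weight tables.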
2025-03-31 18:17:26 +04:00
..
crc32 Merge branch '11.2' into 11.4 2024-09-18 11:27:53 +10:00
array.c Merge 10.11 into 11.4 2025-01-09 07:58:08 +02:00
base64.c Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
ChangeLog
charset-def.c MDEV-36213 Doubled memory usage (11.4.4 <-> 11.4.5) 2025-03-31 18:17:26 +04:00
charset.c MDEV-36213 Doubled memory usage (11.4.4 <-> 11.4.5) 2025-03-31 18:17:26 +04:00
CMakeLists.txt Merge 10.11 into 11.4 2025-03-28 13:55:21 +02:00
crc32ieee.cc Merge branch '10.6' into 10.11 2024-05-10 20:02:18 +02:00
errors.c Merge branch '10.11' into 11.0 2024-05-12 12:18:28 +02:00
file_logger.c Merge 10.4 into 10.5 2021-03-05 12:54:43 +02:00
get_password.c Merge branch '10.6' into 10.9 2023-08-04 08:01:06 +02:00
guess_malloc_library.c Fixed compiler warnings in guess_malloc_library 2018-01-15 16:44:44 +02:00
hash.c Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
lf_alloc-pin.c Fix a stack overflow in pinbox allocator 2024-07-05 13:26:37 +10:00
lf_dynarray.c perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
lf_hash.cc MDEV-27088: Server crash on ARM (WMM architecture) due to missing barriers in lf-hash (10.5) 2021-11-30 15:16:16 +11:00
list.c Merge 10.4 into 10.5 2020-05-13 14:25:06 +03:00
ma_dyncol.c Merge 10.5 into 10.6 2024-05-30 14:27:07 +03:00
mf_arr_appstr.c Update FSF Address 2019-05-11 21:29:06 +03:00
mf_cache.c Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
mf_dirname.c MDEV-21581 Helper functions and methods for CHARSET_INFO 2020-01-28 12:29:23 +04:00
mf_fn_ext.c Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
mf_format.c Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
mf_getdate.c Update FSF Address 2019-05-11 21:29:06 +03:00
mf_iocache.c MDEV-35806: Error in read_log_event() corrupts relay log writer, crashes server 2025-01-24 09:15:20 +00:00
mf_iocache2.c MDEV-31273: Precompute binlog checksums 2023-10-27 19:57:43 +02:00
mf_keycache.c MDEV-36056 Fix VS2019 compilation 2025-02-10 15:27:08 +01:00
mf_keycaches.c Update FSF Address 2019-05-11 21:29:06 +03:00
mf_loadpath.c Update FSF Address 2019-05-11 21:29:06 +03:00
mf_pack.c MDEV-21581 Helper functions and methods for CHARSET_INFO 2020-01-28 12:29:23 +04:00
mf_path.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
mf_qsort.c MDEV-34348: Consolidate cmp function declarations 2024-11-23 08:14:22 -07:00
mf_qsort2.c Update FSF Address 2019-05-11 21:29:06 +03:00
mf_radix.c cleanup: Typo fix appliccable -> applicable 2023-01-30 15:24:15 +02:00
mf_same.c Update FSF Address 2019-05-11 21:29:06 +03:00
mf_sort.c cleanup: Typo fix appliccable -> applicable 2023-01-30 15:24:15 +02:00
mf_soundex.c Update FSF Address 2019-05-11 21:29:06 +03:00
mf_tempdir.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
mf_tempfile.c MDEV-33379 innodb_log_file_buffering=OFF causes corruption on bcachefs 2024-02-20 13:43:19 +02:00
mf_unixpath.c Update FSF Address 2019-05-11 21:29:06 +03:00
mf_wcomp.c Update FSF Address 2019-05-11 21:29:06 +03:00
mulalloc.c Added detection of memory overwrite with multi_malloc 2023-02-27 19:25:44 +02:00
my_access.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
my_addr_resolve.c Backport my_addr_resolve from 10.6 to get latest bug fixes in. 2023-11-27 19:08:14 +02:00
my_alloc.c Merge 10.11 into 11.4 2025-01-09 07:58:08 +02:00
my_atomic_writes.c Merge 10.5 into 10.6 2023-04-11 16:15:19 +03:00
my_basename.c Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
my_bit.c Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
my_bitmap.c Merge branch '11.1' into 11.2 2024-04-09 12:12:33 +02:00
my_chmod.c Merge branch '5.5' into 10.1 2019-05-11 19:15:57 +03:00
my_chsize.c Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
my_compare.c Merge branch '10.4' into 10.5 2023-11-08 12:59:00 +01:00
my_compress.c Cleanup: Remove IF_VALGRIND 2022-04-25 09:40:40 +03:00
my_copy.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
my_cpu.c MDEV-19845: Make my_cpu.h self-contained 2020-02-01 14:56:05 +02:00
my_create.c Change my_umask{,_dir} to mode_t and remove os_innodb_umask 2024-12-11 17:21:01 +11:00
my_default.c MDEV-21375: Get option group suffix from $MARIADB_GROUP_SUFFIX in addition to $MYSQL_GROUP_SUFFIX 2025-03-24 15:36:35 +04:00
my_delete.c Merge 10.6 into 10.7 2022-12-13 18:01:49 +02:00
my_div.c Update FSF Address 2019-05-11 21:29:06 +03:00
my_dlerror.c Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
my_error.c Merge branch '10.4' into 10.5 2020-11-01 14:26:15 +01:00
my_file.c perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
my_fopen.c MDEV-27142 - postfix 2022-11-04 13:50:36 +01:00
my_fstream.c Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
my_getexe.c MDEV-34340 mariadb-backup immediately dumps core on NetBSD 2024-10-16 11:46:19 +11:00
my_gethwaddr.c OS detection logic in my_gethwaddr.c is backwards 2022-11-13 13:12:37 +11:00
my_getncpus.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
my_getopt.c Merge 10.11 into 11.4 2025-03-28 13:55:21 +02:00
my_getpagesize.c MDEV-34062: Implement innodb_log_file_mmap on 64-bit systems 2024-09-26 18:47:12 +03:00
my_getsystime.c Windows : fix warning about potential division by 0 2021-05-09 13:26:03 +02:00
my_getwd.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
my_init.c Merge 10.11 into 11.4 2025-01-09 07:58:08 +02:00
my_largepage.c MDEV-29445: Reimplement SET GLOBAL innodb_buffer_pool_size 2025-03-26 17:05:44 +02:00
my_lib.c MDEV-34348: Consolidate cmp function declarations 2024-11-23 08:14:22 -07:00
my_libwrap.c Update FSF Address 2019-05-11 21:29:06 +03:00
my_likely.c MDEV-34348: my_hash_get_key fixes 2024-11-23 08:14:22 -07:00
my_lock.c MDEV-32567 Remove thr_alarm from server codebase 2023-11-23 11:52:38 +11:00
my_lockmem.c Merge 10.6 into 10.10 2023-10-14 13:36:11 +03:00
my_malloc.c Merge 11.0 into 11.1 2023-10-19 08:26:16 +03:00
my_memmem.c Update FSF Address 2019-05-11 21:29:06 +03:00
my_mess.c MDEV-23846: O_TMPFILE error in mysqlbinlog stream output breaks restore 2020-11-23 12:16:45 +05:30
my_minidump.cc MDEV-11499 mysqltest, Windows : improve diagnostics if server fails to shutdown 2021-09-24 11:49:28 +02:00
my_mkdir.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
my_mmap.c libpmem cmake macros 2020-02-04 23:23:50 +04:00
my_new.cc Fixes that enables my_new.cc (new wrapper using my_malloc) 2021-05-19 22:27:27 +02:00
my_once.c Add memory allocated by my_once_alloc() to memory status 2025-03-31 15:41:16 +04:00
my_open.c Merge 10.5 into 10.6 2024-12-11 14:46:43 +02:00
my_port.c Follow-up to changing FSF address 2019-05-11 18:30:45 +03:00
my_pread.c MDEV-33813 ERROR 1021 (HY000): Disk full (./org/test1.MAI); waiting for someone to free some space... (errno: 28 "No space left on device") 2025-03-06 09:40:55 +02:00
my_pthread.c MDEV-32567 Remove thr_alarm from server codebase 2023-11-23 11:52:38 +11:00
my_quick.c Update FSF Address 2019-05-11 21:29:06 +03:00
my_rdtsc.c Merge remote-tracking branch 'origin/10.6' into 10.11 2024-07-08 21:52:08 +04:00
my_read.c Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
my_redel.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
my_rename.c Merge 10.5 into 10.6 2022-12-13 16:58:58 +02:00
my_rnd.c remove dead code 2022-07-31 14:54:37 +02:00
my_safehash.c MDEV-34348: my_hash_get_key fixes 2024-11-23 08:14:22 -07:00
my_safehash.h Update FSF address 2019-05-10 20:52:00 +03:00
my_seek.c myseek: AIX has no "tell" 2021-03-19 11:14:53 +11:00
my_setuser.c mysys: rename ME_xxx flags to match plugin api 2018-06-04 12:32:23 +02:00
my_sleep.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
my_stack.c Fixup bddbef3573 2024-10-31 10:01:01 +01:00
my_static.c Merge 10.11 into 11.4 2025-01-09 07:58:08 +02:00
my_static.h Update FSF Address 2019-05-11 21:29:06 +03:00
my_symlink.c Merge branch '10.11' into 11.2 2024-10-29 16:42:46 +01:00
my_symlink2.c Change my_umask{,_dir} to mode_t and remove os_innodb_umask 2024-12-11 17:21:01 +11:00
my_sync.c MDEV-381: fdatasync() does not correctly flush growing binlog file 2023-08-10 19:52:04 +02:00
my_thr_init.c MDEV-34077 scripts/mariadb-install-db: Error in my_thread_global_end(): 1 threads didn't exit 2024-05-05 21:37:08 +02:00
my_timezone.cc MDEV-33096 mysys/my_timezone.cc does not compile on AIX 2023-12-22 13:17:55 +01:00
my_uuid.c cleanup: uuid 2021-10-29 18:29:01 +02:00
my_virtual_mem.c MDEV-29445: Reimplement SET GLOBAL innodb_buffer_pool_size 2025-03-26 17:05:44 +02:00
my_win_popen.cc Ensure that source files contain only valid UTF8 encodings (#2188) 2023-05-19 13:21:34 +01:00
my_wincond.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
my_winerr.c Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
my_winfile.c Fixes to make dbug traces from Windows easier to compare with Unix traces 2023-03-02 13:11:54 +02:00
my_winthread.c Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
my_wintoken.c Merge pull request #1221 from grooverdan/10.4-MDEV-18851-multiple-sized-large-page-support 2020-04-02 23:54:08 +04:00
my_write.c Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
mysys_priv.h MDEV-32567 Remove thr_alarm from server codebase 2023-11-23 11:52:38 +11:00
psi_noop.c Merge branch 'merge-perfschema-5.7' into 10.5 2022-08-02 09:34:15 +02:00
ptr_cmp.c MDEV-34348: Consolidate cmp function declarations 2024-11-23 08:14:22 -07:00
queues.c MDEV-34348: Consolidate cmp function declarations 2024-11-23 08:14:22 -07:00
safemalloc.c Merge branch '10.5' into 10.6 2023-12-17 11:20:43 +01:00
stacktrace.c MDEV-30573 Server doesn't build with GCOV by GCC 11+ 2023-02-06 21:25:02 +11:00
string.c MDEV-25602 get rid of __WIN__ in favor of standard _WIN32 2021-06-06 13:21:03 +02:00
test_charset.c MDEV-8334: Rename utf8 to utf8mb3 2021-05-19 06:48:36 +02:00
test_dir.c Update FSF Address 2019-05-11 21:29:06 +03:00
test_thr_mutex.c Update FSF address 2019-05-10 20:52:00 +03:00
test_xml.c Update FSF Address 2019-05-11 21:29:06 +03:00
testhash.c Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
thr_lock.c Merge branch '11.1' into 11.2 2024-02-02 17:43:57 +01:00
thr_mutex.c Merge 10.6 into 10.11 2024-06-11 12:50:10 +03:00
thr_rwlock.c MDEV-34530 dead code in the thr_rwlock.c 2024-07-17 21:25:40 +02:00
thr_timer.c MDEV-35574 remove obsolete pthread_exit calls 2024-12-10 12:12:20 +11:00
tree.c MDEV-28130 MariaDB SEGV issue at tree_search_next 2025-01-14 18:56:14 +03:00
typelib.c Added 'const' to arguments in get_one_option and find_typeset() 2021-02-08 12:16:29 +02:00
waiting_threads.c Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
wqueue.c Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00