Commit graph

14 commits

Author SHA1 Message Date
ParadoxV5
aaa02f6aa3 MDEV-35693: Improve SHOW REPLICA STATUS column sizes
Resize the types and widths of SHOW REPLICA STATUS
(technically `INFORMATION_SCHEMA.SLAVE_STATUS`)
columns to better match their possible values

In case of intentionally but absurdly long lists,
text columns that list an uncapped number of elements
have expanded to accept as many bytes as we could support.

Particularly, the first-gen `Replicate_` filters were
incorrectly typed as singlular `Name()`s during MDEV-33526.
Under `Name`s’ 64-char limit, they could overflow
(read: truncate) even before their lengths got absurd.

In response to `‘MAX_SLAVE_ERRMSG’ was not declared in this scope` in
Embedded builds, a new `#ifdef HAVE_REPLICATION` guard wraps
`slave_status_info` to skip this unused data in Replication-less builds.

For testing, this commit forward-ports a modified cherry-pick of 
(the latter targets our oldest maintained LTS as part of MDEV-35948).

> Assert that 1st-gen `replicate_*` filter variables display
> their input – including long but reasonable lists –
> correctly (without truncation) in
> * direct SELECT
> * [semi-new] INFORMATION_SCHEMA.GLOBAL_VARIABLES.VARIABLE_VALUE
> * [new] SHOW REPLICA STATUS

Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
2025-01-31 20:56:41 -07:00
Alexander Barkov
fd247cc21f MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp()
This patch also fixes:
  MDEV-33050 Build-in schemas like oracle_schema are accent insensitive
  MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
  MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
  MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
  MDEV-33088 Cannot create triggers in the database `MYSQL`
  MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
  MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
  MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
  MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
  MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0

- Removing the virtual function strnncoll() from MY_COLLATION_HANDLER

- Adding a wrapper function CHARSET_INFO::streq(), to compare
  two strings for equality. For now it calls strnncoll() internally.
  In the future it will turn into a virtual function.

- Adding new accent sensitive case insensitive collations:
    - utf8mb4_general1400_as_ci
    - utf8mb3_general1400_as_ci
  They implement accent sensitive case insensitive comparison.
  The weight of a character is equal to the code point of its
  upper case variant. These collations use Unicode-14.0.0 casefolding data.

  The result of
     my_charset_utf8mb3_general1400_as_ci.strcoll()
  is very close to the former
     my_charset_utf8mb3_general_ci.strcasecmp()

  There is only a difference in a couple dozen rare characters, because:
    - the switch from "tolower" to "toupper" comparison, to make
      utf8mb3_general1400_as_ci closer to utf8mb3_general_ci
    - the switch from Unicode-3.0.0 to Unicode-14.0.0
  This difference should be tolarable. See the list of affected
  characters in the MDEV description.

  Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters!
  Unlike utf8mb4_general_ci, it does not treat all BMP characters
  as equal.

- Adding classes representing names of the file based database objects:

    Lex_ident_db
    Lex_ident_table
    Lex_ident_trigger

  Their comparison collation depends on the underlying
  file system case sensitivity and on --lower-case-table-names
  and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci.

- Adding classes representing names of other database objects,
  whose names have case insensitive comparison style,
  using my_charset_utf8mb3_general1400_as_ci:

  Lex_ident_column
  Lex_ident_sys_var
  Lex_ident_user_var
  Lex_ident_sp_var
  Lex_ident_ps
  Lex_ident_i_s_table
  Lex_ident_window
  Lex_ident_func
  Lex_ident_partition
  Lex_ident_with_element
  Lex_ident_rpl_filter
  Lex_ident_master_info
  Lex_ident_host
  Lex_ident_locale
  Lex_ident_plugin
  Lex_ident_engine
  Lex_ident_server
  Lex_ident_savepoint
  Lex_ident_charset
  engine_option_value::Name

- All the mentioned Lex_ident_xxx classes implement a method streq():

  if (ident1.streq(ident2))
     do_equal();

  This method works as a wrapper for CHARSET_INFO::streq().

- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name"
  in class members and in function/method parameters.

- Replacing all calls like
    system_charset_info->coll->strcasecmp(ident1, ident2)
  to
    ident1.streq(ident2)

- Taking advantage of the c++11 user defined literal operator
  for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h)
  data types. Use example:

  const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column;

  is now a shorter version of:

  const Lex_ident_column primary_key_name=
    Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
2024-04-18 15:22:10 +04:00
Alexander Barkov
133446828c MDEV-27009 Add UCA-14.0.0 collations
- Added one neutral and 22 tailored (language specific) collations based on
  Unicode Collation Algorithm version 14.0.0.

  Collations were added for Unicode character sets
  utf8mb3, utf8mb4, ucs2, utf16, utf32.

  Every tailoring was added with four accent and case
  sensitivity flag combinations, e.g:

  * utf8mb4_uca1400_swedish_as_cs
  * utf8mb4_uca1400_swedish_as_ci
  * utf8mb4_uca1400_swedish_ai_cs
  * utf8mb4_uca1400_swedish_ai_ci

  and their _nopad_ variants:

  * utf8mb4_uca1400_swedish_nopad_as_cs
  * utf8mb4_uca1400_swedish_nopad_as_ci
  * utf8mb4_uca1400_swedish_nopad_ai_cs
  * utf8mb4_uca1400_swedish_nopad_ai_ci

- Introducing a conception of contextually typed named collations:

  CREATE DATABASE db1 CHARACTER SET utf8mb4;
  CREATE TABLE db1.t1 (a CHAR(10) COLLATE uca1400_as_ci);

  The idea is that there is no a need to specify the character set prefix
  in the new collation names. It's enough to type just the suffix
  "uca1400_as_ci". The character set is taken from the context.

  In the above example script the context character set is utf8mb4.
  So the CREATE TABLE will make a column with the collation
  utf8mb4_uca1400_as_ci.

  Short collations names can be used in any parts of the SQL syntax
  where the COLLATE clause is understood.

- New collations are displayed only one time
  (without character set combinations) by these statements:

     SELECT * FROM INFORMATION_SCHEMA.COLLATIONS;
     SHOW COLLATION;

  For example, all these collations:
  - utf8mb3_uca1400_swedish_as_ci
  - utf8mb4_uca1400_swedish_as_ci
  - ucs2_uca1400_swedish_as_ci
  - utf16_uca1400_swedish_as_ci
  - utf32_uca1400_swedish_as_ci
  have just one entry in INFORMATION_SCHEMA.COLLATIONS and SHOW COLLATION,
  with COLLATION_NAME equal to "uca1400_swedish_as_ci", which is the suffix
  without the character set name:

SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.COLLATIONS
WHERE COLLATION_NAME LIKE '%uca1400_swedish_as_ci';

+-----------------------+
| COLLATION_NAME        |
+-----------------------+
| uca1400_swedish_as_ci |
+-----------------------+

  Note, the behaviour of old collations did not change.
  Non-unicode collations (e.g. latin1_swedish_ci) and
  old UCA-4.0.0 collations (e.g. utf8mb4_unicode_ci)
  are still displayed with the character set prefix, as before.

- The structure of the table INFORMATION_SCHEMA.COLLATIONS was changed.

  The NOT NULL constraint was removed from these columns:
  - CHARACTER_SET_NAME
  - ID
  - IS_DEFAULT
  and from the corresponding columns in SHOW COLLATION.

  For example:

SELECT COLLATION_NAME, CHARACTER_SET_NAME, ID, IS_DEFAULT
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE COLLATION_NAME LIKE '%uca1400_swedish_as_ci';
+-----------------------+--------------------+------+------------+
| COLLATION_NAME        | CHARACTER_SET_NAME | ID   | IS_DEFAULT |
+-----------------------+--------------------+------+------------+
| uca1400_swedish_as_ci | NULL               | NULL | NULL       |
+-----------------------+--------------------+------+------------+

  The NULL value in these columns now means that the collation
  is applicable to multiple character sets.
  The behavioir of old collations did not change.
  Make sure your client programs can handle NULL values in these columns.

- The structure of the table
  INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY was changed.

  Three new NOT NULL columns were added:
  - FULL_COLLATION_NAME
  - ID
  - IS_DEFAULT

  New collations have multiple entries in COLLATION_CHARACTER_SET_APPLICABILITY.
  The column COLLATION_NAME contains the collation name without the character
  set prefix. The column FULL_COLLATION_NAME contains the collation name with
  the character set prefix.

  Old collations have full collation name in both FULL_COLLATION_NAME and
  COLLATION_NAME.

SELECT COLLATION_NAME, FULL_COLLATION_NAME, CHARACTER_SET_NAME, ID, IS_DEFAULT
FROM INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY
WHERE FULL_COLLATION_NAME RLIKE '^(utf8mb4|latin1).*swedish.*ci$';
+-----------------------------+-------------------------------------+--------------------+------+------------+
| COLLATION_NAME              | FULL_COLLATION_NAME                 | CHARACTER_SET_NAME | ID   | IS_DEFAULT |
+-----------------------------+-------------------------------------+--------------------+------+------------+
| latin1_swedish_ci           | latin1_swedish_ci                   | latin1             |    8 | Yes        |
| latin1_swedish_nopad_ci     | latin1_swedish_nopad_ci             | latin1             | 1032 |            |
| utf8mb4_swedish_ci          | utf8mb4_swedish_ci                  | utf8mb4            |  232 |            |
| uca1400_swedish_ai_ci       | utf8mb4_uca1400_swedish_ai_ci       | utf8mb4            | 2368 |            |
| uca1400_swedish_as_ci       | utf8mb4_uca1400_swedish_as_ci       | utf8mb4            | 2370 |            |
| uca1400_swedish_nopad_ai_ci | utf8mb4_uca1400_swedish_nopad_ai_ci | utf8mb4            | 2372 |            |
| uca1400_swedish_nopad_as_ci | utf8mb4_uca1400_swedish_nopad_as_ci | utf8mb4            | 2374 |            |
+-----------------------------+-------------------------------------+--------------------+------+------------+

- Other INFORMATION_SCHEMA queries:

  SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS;
  SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.PARAMETERS;
  SELECT TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES;
  SELECT DEFAULT_COLLATION_NAME FROM INFORMATION_SCHEMA.SCHEMATA;
  SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.ROUTINES;
  SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.EVENTS;
  SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.EVENTS;
  SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.ROUTINES;
  SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.ROUTINES;
  SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.TRIGGERS;
  SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.TRIGGERS;
  SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.VIEWS;

  display full collation names, including character sets prefix,
  for all collations, including new collations.

  Corresponding SHOW commands also display full collation names
  in collation related columns:

  SHOW CREATE TABLE t1;
  SHOW CREATE DATABASE db1;
  SHOW TABLE STATUS;
  SHOW CREATE FUNCTION f1;
  SHOW CREATE PROCEDURE p1;
  SHOW CREATE EVENT ev1;
  SHOW CREATE TRIGGER tr1;
  SHOW CREATE VIEW;

  These INFORMATION_SCHEMA queries and SHOW statements may change in
  the future, to display show collation names.
2022-08-10 15:04:24 +02:00
Marko Mäkelä
57a9626fe4 Merge 10.5 into 10.6 2022-05-06 11:11:04 +03:00
Alexander Barkov
821808c45d A clean-up for "MDEV-19772 Add helper classes for ST_FIELD_INFO"
As agreed with Serg, renaming class Yesno to Yes_or_empty,
to reflect better its behavior.

This helper class is used to define INFORMATION_SCHEMA columns
that return either "Yes" or an empty string.
2022-04-28 12:12:16 +04:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Alexander Barkov
62e320c86d MDEV-18918 SQL mode EMPTY_STRING_IS_NULL breaks RBR upon CREATE TABLE .. SELECT
The 10.5 version of the patch.

Removing DEFAULT from INFORMATION_SCHEMA columns.
DEFAULT in read-only tables is rather meaningless.
Upgrade should go smoothly.

Also fixes:
 MDEV-20254 Problems with EMPTY_STRING_IS_NULL and I_S tables
2022-01-25 10:31:55 +04:00
Monty
fa7d4abf16 Added typedef decimal_digits_t (uint16) for number of digits in most
aspects of decimals and integers

For fields and Item's uint8 should be good enough. After
discussions with Alexander Barkov we choose uint16 (for now)
as some format functions may accept +256 digits.

The reason for this patch was to make the usage and storage of decimal
digits simlar. Before this patch decimals was stored/used as uint8,
int and uint.  The lengths for numbers where also using a lot of
different types.

Changed most decimal variables and functions to use the new typedef.

squash! af7f09106b6c1dc20ae8c480bff6fd22d266b184

Use decimal_digits_t for all aspects of digits (total, precision
and scale), both for decimals and integers.
2021-05-19 22:27:27 +02:00
Alexander Barkov
daf28db027 MDEV-21537 InnoDB INFORMATION_SCHEMA tables fail to define DEFAULT for ENUM NOT NULL
- Adding a way to define I_S columns without DEFAULT.
- Fixing all ENUM columns in I_S.INNODB* table definitions to have no DEFAULT.
2020-01-22 08:04:53 +04:00
Alexander Barkov
6073049a36 MDEV-20353 Add separate type handlers for unsigned integer data types 2019-08-15 21:53:24 +04:00
Alexander Barkov
6f3612fa4d MDEV-19861 Add intfastructure to have ENUM columns in INFORMATION_SCHEMA 2019-06-26 06:46:55 +04:00
Alexander Barkov
c62eaa7bdf MDEV-19843 Modify ST_FIELD_INFO to use Type_handler and LEX_CSTRING 2019-06-24 06:25:16 +04:00
Alexander Barkov
9561b0b47e MDEV-19810 Reuse new I_S table definition helper classes for InnoDB 2019-06-20 11:40:33 +04:00
Alexander Barkov
4156b1a260 MDEV-19772 Add helper classes for ST_FIELD_INFO 2019-06-16 19:03:47 +04:00