Commit graph

4 commits

Author SHA1 Message Date
Alexander Barkov
fd247cc21f MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp()
This patch also fixes:
  MDEV-33050 Build-in schemas like oracle_schema are accent insensitive
  MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
  MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
  MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
  MDEV-33088 Cannot create triggers in the database `MYSQL`
  MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
  MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
  MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
  MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
  MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0

- Removing the virtual function strnncoll() from MY_COLLATION_HANDLER

- Adding a wrapper function CHARSET_INFO::streq(), to compare
  two strings for equality. For now it calls strnncoll() internally.
  In the future it will turn into a virtual function.

- Adding new accent sensitive case insensitive collations:
    - utf8mb4_general1400_as_ci
    - utf8mb3_general1400_as_ci
  They implement accent sensitive case insensitive comparison.
  The weight of a character is equal to the code point of its
  upper case variant. These collations use Unicode-14.0.0 casefolding data.

  The result of
     my_charset_utf8mb3_general1400_as_ci.strcoll()
  is very close to the former
     my_charset_utf8mb3_general_ci.strcasecmp()

  There is only a difference in a couple dozen rare characters, because:
    - the switch from "tolower" to "toupper" comparison, to make
      utf8mb3_general1400_as_ci closer to utf8mb3_general_ci
    - the switch from Unicode-3.0.0 to Unicode-14.0.0
  This difference should be tolarable. See the list of affected
  characters in the MDEV description.

  Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters!
  Unlike utf8mb4_general_ci, it does not treat all BMP characters
  as equal.

- Adding classes representing names of the file based database objects:

    Lex_ident_db
    Lex_ident_table
    Lex_ident_trigger

  Their comparison collation depends on the underlying
  file system case sensitivity and on --lower-case-table-names
  and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci.

- Adding classes representing names of other database objects,
  whose names have case insensitive comparison style,
  using my_charset_utf8mb3_general1400_as_ci:

  Lex_ident_column
  Lex_ident_sys_var
  Lex_ident_user_var
  Lex_ident_sp_var
  Lex_ident_ps
  Lex_ident_i_s_table
  Lex_ident_window
  Lex_ident_func
  Lex_ident_partition
  Lex_ident_with_element
  Lex_ident_rpl_filter
  Lex_ident_master_info
  Lex_ident_host
  Lex_ident_locale
  Lex_ident_plugin
  Lex_ident_engine
  Lex_ident_server
  Lex_ident_savepoint
  Lex_ident_charset
  engine_option_value::Name

- All the mentioned Lex_ident_xxx classes implement a method streq():

  if (ident1.streq(ident2))
     do_equal();

  This method works as a wrapper for CHARSET_INFO::streq().

- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name"
  in class members and in function/method parameters.

- Replacing all calls like
    system_charset_info->coll->strcasecmp(ident1, ident2)
  to
    ident1.streq(ident2)

- Taking advantage of the c++11 user defined literal operator
  for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h)
  data types. Use example:

  const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column;

  is now a shorter version of:

  const Lex_ident_column primary_key_name=
    Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
2024-04-18 15:22:10 +04:00
Alexander Barkov
2b6d241ee4 MDEV-27744 LPAD in vcol created in ORACLE mode makes table corrupted in non-ORACLE
The crash happened with an indexed virtual column whose
value is evaluated using a function that has a different meaning
in sql_mode='' vs sql_mode=ORACLE:

- DECODE()
- LTRIM()
- RTRIM()
- LPAD()
- RPAD()
- REPLACE()
- SUBSTR()

For example:

CREATE TABLE t1 (
  b VARCHAR(1),
  g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL,
  KEY g(g)
);

So far we had replacement XXX_ORACLE() functions for all mentioned function,
e.g. SUBSTR_ORACLE() for SUBSTR(). So it was possible to correctly re-parse
SUBSTR_ORACLE() even in sql_mode=''.

But it was not possible to re-parse the MariaDB version of SUBSTR()
after switching to sql_mode=ORACLE. It was erroneously mis-interpreted
as SUBSTR_ORACLE().

As a result, this combination worked fine:

SET sql_mode=ORACLE;
CREATE TABLE t1 ... g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL, ...;
INSERT ...
FLUSH TABLES;
SET sql_mode='';
INSERT ...

But the other way around it crashed:

SET sql_mode='';
CREATE TABLE t1 ... g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL, ...;
INSERT ...
FLUSH TABLES;
SET sql_mode=ORACLE;
INSERT ...

At CREATE time, SUBSTR was instantiated as Item_func_substr and printed
in the FRM file as substr(). At re-open time with sql_mode=ORACLE, "substr()"
was erroneously instantiated as Item_func_substr_oracle.

Fix:

The fix proposes a symmetric solution. It provides a way to re-parse reliably
all sql_mode dependent functions to their original CREATE TABLE time meaning,
no matter what the open-time sql_mode is.

We take advantage of the same idea we previously used to resolve sql_mode
dependent data types.

Now all sql_mode dependent functions are printed by SHOW using a schema
qualifier when the current sql_mode differs from the function sql_mode:

SET sql_mode='';
CREATE TABLE t1 ... SUBSTR(a,b,c) ..;
SET sql_mode=ORACLE;
SHOW CREATE TABLE t1;   ->   mariadb_schema.substr(a,b,c)

SET sql_mode=ORACLE;
CREATE TABLE t2 ... SUBSTR(a,b,c) ..;
SET sql_mode='';
SHOW CREATE TABLE t1;   ->   oracle_schema.substr(a,b,c)

Old replacement names like substr_oracle() are still understood for
backward compatibility and used in FRM files (for downgrade compatibility),
but they are not printed by SHOW any more.
2023-11-08 15:01:20 +04:00
Alexander Barkov
ddcc9d2281 MDEV-31153 New methods Schema::make_item_func_* for REPLACE, SUBSTRING, TRIM
Adding virtual methods to class Schema:

  make_item_func_replace()
  make_item_func_substr()
  make_item_func_trim()

This is a non-functional preparatory change for MDEV-27744.
2023-04-29 08:06:46 +04:00
Alexander Barkov
d63631c3fa MDEV-19632 Replication aborts with ER_SLAVE_CONVERSION_FAILED upon CREATE ... SELECT in ORACLE mode
- Adding optional qualifiers to data types:
    CREATE TABLE t1 (a schema.DATE);
  Qualifiers now work only for three pre-defined schemas:

    mariadb_schema
    oracle_schema
    maxdb_schema

  These schemas are virtual (hard-coded) for now, but may turn into real
  databases on disk in the future.

- mariadb_schema.TYPE now always resolves to a true MariaDB data
  type TYPE without sql_mode specific translations.

- oracle_schema.DATE translates to MariaDB DATETIME.

- maxdb_schema.TIMESTAMP translates to MariaDB DATETIME.

- Fixing SHOW CREATE TABLE to use a qualifier for a data type TYPE
  if the current sql_mode translates TYPE to something else.

The above changes fix the reported problem, so this script:

    SET sql_mode=ORACLE;
    CREATE TABLE t2 AS SELECT mariadb_date_column FROM t1;

is now replicated as:

    SET sql_mode=ORACLE;
    CREATE TABLE t2 (mariadb_date_column mariadb_schema.DATE);

and the slave can unambiguously treat DATE as the true MariaDB DATE
without ORACLE specific translation to DATETIME.

Similar,

    SET sql_mode=MAXDB;
    CREATE TABLE t2 AS SELECT mariadb_timestamp_column FROM t1;

is now replicated as:

    SET sql_mode=MAXDB;
    CREATE TABLE t2 (mariadb_timestamp_column mariadb_schema.TIMESTAMP);

so the slave treats TIMESTAMP as the true MariaDB TIMESTAMP
without MAXDB specific translation to DATETIME.
2020-08-01 07:43:50 +04:00