The task "MDEV-25829 Change default Unicode collation to uca1400_ai_ci"
previously changed collation derivation for string user variables
from DERIVATION_EXPLICIT to DERIVATION_COERCIBLE, to resolve illegal
collation mix conflicts between table columns and user variables
when they have different collations.
However, DERIVATION_COERCIBLE was a wrong choice because it caused
conflicts between string literals and user variables when they have
different collations.
Adding a new collation derivation level DERIVATION_USERVAR.
This makes the collation of a user variable:
- weaker than a table column (like it was intended by MDEV-25829)
- but stronger than a literal (like it was in pre-MDEV-25829)
Cleanup in sql_type.h:
Removing the line "- BINARY(expr)" from the before-DERIVATION_CAST
comment, as it was on a wrong place. It's also listed on the correct
place before DERIVATION_IMPLICIT.
Step#2 - Adding a new collation derivation level for CAST and CONVERT.
Now character string cast functions:
- CAST(string_expr AS CHAR)
- CONVERT(expr USING charset_name)
have a new collation derivation level between:
- string literals
- utf8 metadata functions, e.g. user() and database()
Before the change these cast functions had collation derivation equal
to table columns, which caused more illegal mix of collation conflicts.
Note, binary string cast functions:
- BINARY(expr)
- CAST(string_expr AS BINARY)
- CONVERT(expr USING binary)
did not change their collation derivation, to preserve the behaviour of
queries like these:
SELECT database()=BINARY'test';
SELECT user()=CAST('root' AS BINARY);
SELECT current_role()=CONVERT('role' USING binary);
Derivation levels after the change look as follows:
DERIVATION_IGNORABLE= 7, // Explicit NULL
DERIVATION_NUMERIC= 6, // Numbers in string context,
// Numeric user variables
// CAST(numeric_expr AS CHAR)
DERIVATION_COERCIBLE= 5, // Literals, string user variables
DERIVATION_CAST= 4, // CAST(string_expr AS CHAR),
// CONVERT(string_expr USING cs)
DERIVATION_SYSCONST= 3, // utf8 metadata functions, e.g. user(), database()
DERIVATION_IMPLICIT= 2, // Table columns, SP variables, BINARY(expr)
DERIVATION_NONE= 1, // A mix (e.g. CONCAT) of two differrent collations
DERIVATION_EXPLICIT= 0 // An explicit COLLATE clause
when an internal temporary table field is created from a real field,
a new temp field should only copy a default from the source field
when the latter has it
This patch changes the main name of 3 byte character set from utf8 to
utf8mb3. New old_mode UTF8_IS_UTF8MB3 is added and set TRUE by default,
so that utf8 would mean utf8mb3. If not set, utf8 would mean utf8mb4.
This can cause crashes when accessing already released memory
The issue was the Item_default created a internal field, pointing to
share->default_values, to be used with the DEFAULT() function.
This does not work for BLOB fields as these are freed at end of query.
Fixed by storing BLOB field data inside and area allocated by
Item_default_value, like we do for nondeterministic default values.