For DECIMAL[(M[,D])] datatype max_sort_length was not being honoured which was leading to buffer
overflow while making the sort key. The fix to this problem would be to create sort keys for decimals
with atmost max_sort_key bytes
Important:
The minimum value of max_sort_length has been raised to 8 (previously was 4),
so fixed size datatypes like DOUBLE and BIGINIT are not truncated for
lower values of max_sort_length.
When neither MSAN nor Valgrind are enabled, declare
Field::mark_unused_memory_as_defined() as an empty inline function,
instead of declaring it as a virtual function.
MDEV-22073 MSAN use-of-uninitialized-value in
collect_statistics_for_table()
Other things:
innodb.analyze_table was changed to mainly test statistic
collection. Was discussed with Marko.
- Fixed mysql_prepare_create_table() constraint duplicate checking;
- Refactored period constraint handling in mysql_prepare_alter_table():
* No need to allocate new objects;
* Keep old constraint name but exclude it from dup checking by automatic_name;
- Some minor memory leaks fixed;
- Some conceptual TODOs.
compare_keys_but_name(): do not use KEY_PART_INFO::field for
Field::is_equal(). Following the logic of that code we need to
compare fields of a table. But KEY_PART_INFO::field sometimes
(when key part is shorter than table field) is a different field.
In that case Field::is_equal() returns incorrect result and
problems occur.
KEY_PART_INFO::field may become some strange field in
open_frm_error open_table_from_share(). I think this is an
incorrect logic, some tecnhical debt. I'm not fixing it right now,
because I don't have time. But I'm making Field::field_length
a const class member. Then, the only fishy code which changed that
field requires now a const_cast<>. I'm bringing attention to that
code with it. This change should not affect logic of the
program in any way.
MDEV-20589: Server still crashes in Field::set_warning_truncated_wrong_value
- Use dbug_tmp_use_all_columns() to mark that all fields can be used
- Remove field->is_stat_field (not needed)
- Remove extra arguments to Field::clone() that should not be there
- Safety fix for Field::set_warning_truncated_wrong_value() to not crash
if table is zero in production builds (We have got crashes several times
here so better to be safe than sorry).
- Threat wrong character string warnings identical to other field
conversion warnings. This removes some warnings we before got from
internal conversion errors. There is no good reason why a user would
get an error in case of 'key_field='wrong-utf8-string' but not for
'field=wrong-utf8-string'. The old code could also easily give
thousands of no-sence warnings for one single statement.
The flag is_stat_field is not set for the min_value and max_value of field items
inside table share. This is a must requirement as we don't want to throw
warnings of truncation when we read values from the statistics table to the column
statistics of table share fields.
Conversion to a temporal data type resulting into a lower precision
depends on TIME_ROUND_FRACTIONAL. Taking into account this dependency in:
- indexed generated virtual column expressions
- persistent virtual column expressions
A warning is now issued if conversion from the generation expression
to the column data type depends on TIME_ROUND_FRACTIONAL.
The warning will be changed to error in 10.5
remove a special treatment of a bare DEFAULT keyword that made it
behave inconsistently and differently from DEFAULT(column).
Now all forms of the explicit assignment of a default column value
behave identically, and all count as an explicitly assigned value
(for the purpose of ON UPDATE NOW).
followup for c7c481f4d9
* remove one level of virtual functions
* remove redundant checks
* remove an if() as the value is always known at compilation time
don't pretend that "DEFAULT expr" and "ON UPDATE DEFAULT NOW"
are "basically the same thing"
This patch allows the server to open old tables that have
"bad" generated columns (i.e. indexed virtual generated columns,
persistent generated columns) that depend on sql_mode,
for general things like SELECT, INSERT, DROP, etc.
Warning are issued in such cases.
Only these commands are now disallowed and return an error:
- CREATE TABLE introducing a "bad" generated column
- ALTER TABLE introducing a "bad" generated column
- CREATE INDEX introdicing a "bad" generated column
(i.e. adding an index on a virtual generated column
that depends on sql_mode).
Note, these commands are allowed:
- ALTER TABLE removing a "bad" generate column
- ALTER TABLE removing an index from a "bad" virtual generated column
- DROP INDEX removing an index from a "bad" virtual generated column
but only if the table does not have any "bad" columns as a result.
This change takes into account a column's GENERATED ALWAYS AS
expression dependcy on sql_mode's PAD_CHAR_TO_FULL_LENGTH and
NO_UNSIGNED_SUBTRACTION flags.
Indexed virtual columns as well as persistent generated columns are
now not allowed to have such dependencies to avoid inconsistent data
or index files on sql_mode changes.
So an error is now returned in cases like this:
CREATE OR REPLACE TABLE t1
(
a CHAR(5),
v VARCHAR(5) AS (a) PERSISTENT -- CHAR->VARCHAR or CHAR->TEXT = ERROR
);
Functions RPAD() and RTRIM() can now remove dependency on
PAD_CHAR_TO_FULL_LENGTH. So this can be used instead:
CREATE OR REPLACE TABLE t1
(
a CHAR(5),
v VARCHAR(5) AS (RTRIM(a)) PERSISTENT
);
Note, unlike CHAR->VARCHAR and CHAR->TEXT this still works,
not RPAD(a) is needed:
CREATE OR REPLACE TABLE t1
(
a CHAR(5),
v CHAR(5) AS (a) PERSISTENT -- CHAR->CHAR is OK
);
More sql_mode flags may affect values of generated columns.
They will be addressed separately.
See comments in sql_mode.h for implementation details.
With --skip-debug-assert, DBUG_ASSERT(false) will allow execution to
continue. Hence, we will need /* fall through */ after them.
Some DBUG_ASSERT(0) were replaced by break; when the switch () statement
was followed by DBUG_ASSERT(0).
Cherry picking:
Bug#25135304: RBR: WRONG FIELD LENGTH IN ERROR MESSAGE
commit 47bd3f7cf3c8518f62b1580ec65af2ba7ac13b95
Description:
============
In row based replication, when replicating from a table with a field with
character set set to UTF8mb3 to the same table with the same field set to
character set UTF8mb4 I get a confusing error message:
For VARCHAR: VARCHAR(1) 'utf8mb3' to VARCHAR(1) 'utf8mb4'
"Column 0 of table 'test.t1' cannot be converted from type 'varchar(3)' to
type 'varchar(1)'"
Similar issue with CHAR type as well.
Issue with respect to BLOB types:
For BLOB: LONGBLOB to TINYBLOB - Error message displays incorrect blob type.
"Column 0 of table 'test.t1' cannot be converted from type 'tinyblob' to type
'tinyblob'"
For BINARY to BINARY - Error message displays incorrect type for master side
field.
"Column 0 of table 'test.t' cannot be converted from type 'char(1)' to type
'binary(10)'"
Similar issue exists for VARBINARY type. It is displayed as 'VARCHAR'.
Analysis:
=========
In Row based replication charset information is not sent as part of metadata
from master to slave.
For VARCHAR field its character length is converted into equivalent
octets/bytes and stored internally. At the time of displaying the data to user
it is converted back to original character length.
For example:
VARCHAR(2)- utf8mb3 is stored as:2*3 = VARCHAR(6)
At the time of displaying it to user
VARCHAR(6)- charset utf8mb3:6/3= VARCHAR(2).
At present the internally converted octect length is sent from master to slave
with out providing the charset information. On slave side if the type
conversion fails 'show_sql_type' function is used to get the type specific
information from metadata. Since there is no charset information is available
the filed type is displayed as VARCHAR(6).
This results in confused error message.
For CHAR fields
CHAR(1)- utf8mb3 - CHAR(3)
CHAR(1)- utf8mb4 - CHAR(4)
'show_sql_type' function which retrieves type information from metadata uses
(bytes/local charset length) to get actual character length. If slave's chaset
is 'utf8mb4' then
CHAR(3/4)-->CHAR(0)
CHAR(4/4)-->CHAR(1).
This results in confused error message.
Analysis for BLOB type issue:
BLOB's length is represented in two forms.
1. Actual length
i.e
(length < 256) type= MYSQL_TYPE_TINY_BLOB;
(length < 65536) type= MYSQL_TYPE_BLOB; ...
2. packlength - The number of bytes used to represent the length of the blob
1- tinyblob
2- blob ...
In row based replication only the packlength is written in the binary log. On
the slave side this packlength is interpreted as actual length of the blob.
Hence the length is always < 256 and the type is displayed as tiny blob.
Analysis for BINARY to BINARY type issue:
The character set information is needed to identify a filed's type as char or
binary. Since master side character set information is not available on the
slave side both binary and char fields are displayed as char.
Fix:
===
For CHAR and VARCHAR fields display their length in octets for both source and
target fields. For target field display the charset information if it is
relevant.
For blob type changed the code to use the packlength and display appropriate
blob type in error message.
For binary and varbinary fields use the slave side character set as reference
to map them to binary or varbinary fields.
MDEV-19486 and one more similar bug appeared because handler::write_row() interface
welcomes to modify buffer by storage engine. But callers are not ready for that
thus bugs are possible in future.
handler::write_row():
handler::ha_write_row(): make argument const
cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug
Maintainer mode makes all warnings errors. This patch fix warnings. Mostly about
deprecated `register` keyword.
Too much warnings came from Mroonga and I gave up on it.
Make Field::is_equal() const and return bool as it's a naturally fitting
type for it. Also it's agrument was narrowed to Column_definition.
InnoDB can change type of some columns by itself. InnoDB-specific code used to
reside in Field_xxx:is_equal() methods. Now engine-specific stuff was
moved to a virtual methods of handler::can_convert{string,varstring,blob,geom}.
These methods are called by Field::can_be_converted_by_engine() which is a
double dispatch pattern.
Some InnoDB-specific code still resides in compare_keys_but_name(). It should
be moved from here someday to handler::compare_key_parts(...) or similar.
IS_EQUAL_WITH_REINTERPRET_COMPATIBLE_CHARSET
IS_EQUAL_WITH_REINTERPRET_COMPATIBLE_CHARSET_BUT_COLLATE: both was removed
IS_EQUAL_NO, IS_EQUAL_YES are not needed now and should be removed
along with deprecated handler::check_if_incompatible_data().
HA_EXTENDED_TYPES_CONVERSION: was removed as such logic is not needed now by
server code.
ALTER_COLUMN_EQUAL_PACK_LENGTH: was renamed to a more generic
ALTER_COLUMN_TYPE_CHANGE_BY_ENGINE
Patch is about two cases:
1) On some collate changes it's possible to rebuild only secondary indexes
2) For non-indexed columns collate can be changed INSTANTly
Implemented mostly in Field_{string,varstring,blob}::is_equal().
Make this method return how exactly collationa differs.
This information is later used by fill_alter_inplace_info() to pass
correct info to engine.
In collaboration with Sergey Vojtovich <svoj@mariadb.org>
The COMPRESSED clause is now a part of the data type and goes immediately
after the data type and length, but before the CHARACTER SET clause,
and before column attributes such as DEFAULT, COLLATE, ON UPDATE,
SYSTEM VERSIONING, engine specific column attributes.
In the old reduction, the COMPRESSED clause was a column attribute.
New syntax:
<varchar or text data type> <length> <compression> <character set> <column attributes>
<varbinary or blob data type> <length> <compression> <column attributes>
New syntax examples:
VARCHAR(1000) COMPRESSED CHARACTER SET latin1 DEFAULT ''
BLOB COMPRESSED DEFAULT ''
Deprecate syntax examples:
VARCHAR(1000) CHARACTER SET latin1 COMPRESSED DEFAULT ''
TEXT CHARACTER SET latin1 DEFAULT '' COMPRESSED
VARBINARY(1000) DEFAULT '' COMPRESSED
As a side effect:
- COMPRESSED is not valid as an SP label name in SQL/PSM routines any more
(but it's still valid as an SP label name in sql_mode=ORACLE)
- COMPRESSED is now allowed in combination with GENERATED ALWAYS AS:
TEXT COMPRESSED GENERATED ALWAYS AS REPEAT('a',1000)
Introduced a print_key_value function to makes sure that the trace prints data in readable format
for readable characters and the rest of the characters are printed as hexadecimal.
This patch fixes:
- MDEV-19284 INSTANT ALTER with ucs2-to-utf16 conversion produces bad data
- MDEV-19285 INSTANT ALTER from ascii_general_ci to latin1_general_ci produces corrupt data
These regressions were introduced in 10.4.3 by:
- MDEV-15564 Avoid table rebuild in ALTER TABLE on collation or charset changes
Changes:
1. Cleanup: Adding a helper method
Field_longstr::csinfo_change_allows_instant_alter(),
to remove some duplicate code in field.cc.
2. Cleanup: removing Type_handler::Charsets_are_compatible() and static
function charsets_are_compatible() and
introducing new methods in the recently added class Charset instead:
- encoding_allows_reinterpret_as()
- encoding_and_order_allow_reinterpret_as()
3. Bug fix: Removing the code that allowed instant conversion for
ascii-to->8bit and ucs2-to->utf16.
This actually fixes MDEV-19284 and MDEV-19285.
4. Bug fix: Adding a helper method Charset::collation_specific_name().
The old corresponding code in Type_handler::Charsets_are_compatible()
was not safe against (badly named) user-defined collations whose
character set name can be longer than collation name.
Field_bit for BIT(20) uses 2 full bytes in the record,
with additional 4 uneven bits in the "null bit area".
Field::set_default() called from Field_bit::set_default() erroneously
copied 3 bytes instead of 2 bytes from the record with default values.
Changing Field::set_default() to copy pack_length_in_rec() bytes
instead of pack_length() bytes.
Adding new virtual methods in Field:
- make_empty_rec_store_default_value()
- make_empty_rec_reset()
This simplifies the logic for every Field type,
and makes the code more friendly to pluggable data types.