MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range
- Added --update-history option to mariadb-dump to change 2038
row_end timestamp to 2106.
- Updated ALTER TABLE ... to convert old row_end timestamps to
2106 timestamp for tables created before MariaDB 11.4.0.
- Fixed bug in CHECK TABLE where we wrongly suggested to USE REPAIR
TABLE when ALTER TABLE...FORCE is needed.
- mariadb-check printed table names that where used with REPAIR TABLE but
did not print table names used with ALTER TABLE or with name repair.
Fixed by always printing a table that is fixed if --silent is not
used.
- Added TABLE::vers_fix_old_timestamp() that will change max-timestamp
for versioned tables when replication from a pre-11.4.0 server.
A few test cases changed. This is caused by:
- CHECK TABLE now prints 'Please do ALTER TABLE... instead of
'Please do REPAIR TABLE' when there is a problem with the information
in the .frm file (for example a very old frm file).
- mariadb-check now prints repaired table names.
- mariadb-check also now prints nicer error message in case ALTER TABLE
is needed to repair a table.
This task is to ensure we have a clear definition and rules of how to
repair or optimize a table.
The rules are:
- REPAIR should be used with tables that are crashed and are
unreadable (hardware issues with not readable blocks, blocks with
'unexpected data' etc)
- OPTIMIZE table should be used to optimize the storage layout for the
table (recover space for delete rows and optimize the index
structure.
- ALTER TABLE table_name FORCE should be used to rebuild the .frm file
(the table definition) and the table (with the original table row
format). If the table is from and older MariaDB/MySQL release with a
different storage format, it will convert the data to the new
format. ALTER TABLE ... FORCE is used as part of mariadb-upgrade
Here follows some more background:
The 3 ways to repair a table are:
1) ALTER TABLE table_name FORCE" (not other options).
As an alias we allow: "ALTER TABLE table_name ENGINE=original_engine"
2) "REPAIR TABLE" (without FORCE)
3) "OPTIMIZE TABLE"
All of the above commands will optimize row space usage (which means that
space will be needed to hold a temporary copy of the table) and
re-generate all indexes. They will also try to replicate the original
table definition as exact as possible.
For ALTER TABLE and "REPAIR TABLE without FORCE", the following will hold:
If the table is from an older MariaDB version and data conversion is
needed (for example for old type HASH columns, MySQL JSON type or new
TIMESTAMP format) "ALTER TABLE table_name FORCE, algorithm=COPY" will be
used.
The differences between the algorithms are
1) Will use the fastest algorithm the engine supports to do a full repair
of the table (except if data conversions are is needed).
2) Will use the storage engine internal REPAIR facility (MyISAM, Aria).
If the engine does not support REPAIR then
"ALTER TABLE FORCE, ALGORITHM=COPY" will be used.
If there was data incompatibilities (which means that FORCE was used)
then there will be a warning after REPAIR that ALTER TABLE FORCE is
still needed.
The reason for this is that REPAIR may be able to go around data
errors (wrong incompatible data, crashed or unreadable sectors) that
ALTER TABLE cannot do.
3) Will use the storage engine internal OPTIMIZE. If engine does not
support optimize, then "ALTER TABLE FORCE" is used.
The above will ensure that ALTER TABLE FORCE is able to
correct almost any errors in the row or index data. In case of
corrupted blocks then REPAIR possible followed by ALTER TABLE is needed.
This is important as mariadb-upgrade executes ALTER TABLE table_name
FORCE for any table that must be re-created.
Bugs fixed with InnoDB tables when using ALTER TABLE FORCE:
- No error for INNODB_DEFAULT_ROW_FORMAT=COMPACT even if row length
would be too wide. (Independent of innodb_strict_mode).
- Tables using symlinks will be symlinked after any of the above commands
(Independent of the setting of --symbolic-links)
If one specifies an algorithm together with ALTER TABLE FORCE, things
will work as before (except if data conversion is required as then
the COPY algorithm is enforced).
ALTER TABLE .. OPTIMIZE ALL PARTITIONS will work as before.
Other things:
- FORCE argument added to REPAIR to allow one to first run internal
repair to fix damaged blocks and then follow it with ALTER TABLE.
- REPAIR will not update frm_version if ha_check_for_upgrade() finds
that table is still incompatible with current version. In this case the
REPAIR will end with an error.
- REPAIR for storage engines that does not have native repair, like InnoDB,
is now using ALTER TABLE FORCE.
- REPAIR csv-table USE_FRM now works.
- It did not work before as CSV tables had extension list in wrong
order.
- Default error messages length for %M increased from 128 to 256 to not
cut information from REPAIR.
- Documented HA_ADMIN_XX variables related to repair.
- Added HA_ADMIN_NEEDS_DATA_CONVERSION to signal that we have to
do data conversions when converting the table (and thus ALTER TABLE
copy algorithm is needed).
- Fixed typo in error message (caused test changes).
MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range
- Changed usage of timeval to my_timeval as the timeval parts on windows
are 32-bit long, which causes some compiler issues on windows.
This patch extends the timestamp from
2038-01-19 03:14:07.999999 to 2106-02-07 06:28:15.999999
for 64 bit hardware and OS where 'long' is 64 bits.
This is true for 64 bit Linux but not for Windows.
This is done by treating the 32 bit stored int as unsigned instead of
signed. This is safe as MariaDB has never accepted dates before the epoch
(1970).
The benefit of this approach that for normal timestamp the storage is
compatible with earlier version.
However for tables using system versioning we before stored a
timestamp with the year 2038 as the 'max timestamp', which is used to
detect current values. This patch stores the new 2106 year max value
as the max timestamp. This means that old tables using system
versioning needs to be updated with mariadb-upgrade when moving them
to 11.4. That will be done in a separate commit.
Fixed that no tables from 'mysql' schema are included in userstat.
A beneif of this is that the server is not reading statistics tables
if mysql.proc or other tables in mysql is accessed.
on disable_indexes(HA_KEY_SWITCH_NONUNIQ_SAVE) the engine does
not know that the long unique is logically unique, because on the
engine level it is not. And the engine disables it,
Change the disable_indexes/enable_indexes API. Instead of the enum
mode, send a key_map of indexes that should be enabled. This way the
server will decide what is unique, not the engine.
It was possible to create virtual columns with incompatible
GENERATED ALWAYS expression data types:
CREATE TABLE t1 (a INT, b POINT GENERATED ALWAYS AS (a));
CREATE TABLE t1 (a POINT, b INT GENERATED ALWAYS AS (a));
These data type combinations are not allowed in other cases,
e.g. INSERT, UPDATE, SP variable assignment.
Fix:
Disallowing bad combinations of the column data type and its
GENERATED ALWAYS expression data type.
Fixing the problem that an operation involving a mix of
two or more GEOMETRY operands did not preserve their SRIDs.
Now SRIDs are preserved by hybrid functions, subqueries, TVCs, UNIONs, VIEWs.
Incompatible change:
An attempt to mix two different SRIDs now raises an error.
Details:
- Adding a new class Type_extra_attributes. It's a generic
container which can store very specific data type attributes.
For now it can store one uint32 and one const pointer attribute
(for GEOMETRY's SRID and for ENUM/SET TYPELIB respectively).
In the future it can grow as needed.
Type_extra_attributes will also be reused soon to store "const Type_zone*"
pointers for the TIMESTAMP's "WITH TIME ZONE 'tz'" attribute
(a timestamp data type with a fixed time zone independent from @@time_zone).
The time zone attribute will be stored in exactly the same way like
a TYPELIB pointer is stored by ENUM/SET.
- Removing Column_definition_attributes members "interval" and "srid".
Deriving Column_definition_attributes from the generic attribute container
Type_extra_attributes instead.
- Adding a new class Type_typelib_attributes, to store
the TYPELIB of the ENUM and SET data types. Deriving Field_enum from it.
Removing the member Field_enum::typelib.
- Adding a new class Type_geom_attributes, to store
the GEOMETRY related attributes. Deriving Field_geom from it.
Removing the member Field_geom::srid.
- Removing virtual methods:
Field::get_typelib()
Type_all_attributes::get_typelib() and
Type_all_attributes::set_typelib()
They were very specific to TYPELIB.
Adding more generic virtual methods instead:
* Field::type_extra_attributes() - to get extra attributes
* Type_all_attributes::type_extra_attributes() - to get extra attributes
* Type_all_attributes::type_extra_attributes_addr() - to set extra attributes
- Removing Item_type_holder::enum_set_typelib. Deriving Item_type_holder
from the generic attribute container Type_extra_attributes instead.
This makes it possible for UNION to preserve SRID
(in addition to preserving TYPELIB).
- Deriving Item_hybrid_func from Type_extra_attributes.
This makes it possible for hybrid functions (e.g. CASE, COALESCE,
LEAST, GREATEST etc) to preserve SRID.
- Deriving Item_singlerow_subselect from Type_extra_attributes and
overriding methods:
* Item_cache::type_extra_attributes()
* subselect_single_select_engine::fix_length_and_dec()
* Item_singlerow_subselect::type_extra_attributes()
* Item_singlerow_subselect::type_extra_attributes_addr()
This is needed to preserve SRID in subqueries and TVCs
- Cleanup: fixing the data type of members
* Binlog_type_info::m_enum_typelib
* Binlog_type_info::m_set_typelib
from "TYPELIB *" to "const TYPELIB *"
Fix the previous patch:
- Only enable handler_stats if thd->should_collect_handler_stats()==true.
- Make handler_index_cond_check() work when handler_stats are not enabled.
instance, for ICP accounting, which is created
during ha_handler_stats_reset(). Always invoke that method
during table init to ensure that the handler, regardless of
implementation, has the pointer set correctly
This patch also fixes:
MDEV-33050 Build-in schemas like oracle_schema are accent insensitive
MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0
- Removing the virtual function strnncoll() from MY_COLLATION_HANDLER
- Adding a wrapper function CHARSET_INFO::streq(), to compare
two strings for equality. For now it calls strnncoll() internally.
In the future it will turn into a virtual function.
- Adding new accent sensitive case insensitive collations:
- utf8mb4_general1400_as_ci
- utf8mb3_general1400_as_ci
They implement accent sensitive case insensitive comparison.
The weight of a character is equal to the code point of its
upper case variant. These collations use Unicode-14.0.0 casefolding data.
The result of
my_charset_utf8mb3_general1400_as_ci.strcoll()
is very close to the former
my_charset_utf8mb3_general_ci.strcasecmp()
There is only a difference in a couple dozen rare characters, because:
- the switch from "tolower" to "toupper" comparison, to make
utf8mb3_general1400_as_ci closer to utf8mb3_general_ci
- the switch from Unicode-3.0.0 to Unicode-14.0.0
This difference should be tolarable. See the list of affected
characters in the MDEV description.
Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters!
Unlike utf8mb4_general_ci, it does not treat all BMP characters
as equal.
- Adding classes representing names of the file based database objects:
Lex_ident_db
Lex_ident_table
Lex_ident_trigger
Their comparison collation depends on the underlying
file system case sensitivity and on --lower-case-table-names
and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci.
- Adding classes representing names of other database objects,
whose names have case insensitive comparison style,
using my_charset_utf8mb3_general1400_as_ci:
Lex_ident_column
Lex_ident_sys_var
Lex_ident_user_var
Lex_ident_sp_var
Lex_ident_ps
Lex_ident_i_s_table
Lex_ident_window
Lex_ident_func
Lex_ident_partition
Lex_ident_with_element
Lex_ident_rpl_filter
Lex_ident_master_info
Lex_ident_host
Lex_ident_locale
Lex_ident_plugin
Lex_ident_engine
Lex_ident_server
Lex_ident_savepoint
Lex_ident_charset
engine_option_value::Name
- All the mentioned Lex_ident_xxx classes implement a method streq():
if (ident1.streq(ident2))
do_equal();
This method works as a wrapper for CHARSET_INFO::streq().
- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name"
in class members and in function/method parameters.
- Replacing all calls like
system_charset_info->coll->strcasecmp(ident1, ident2)
to
ident1.streq(ident2)
- Taking advantage of the c++11 user defined literal operator
for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h)
data types. Use example:
const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column;
is now a shorter version of:
const Lex_ident_column primary_key_name=
Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
Ideally our methods and functions should do one thing, do that well,
and do only that. add_table_to_list does far more than adding a
table to a list, so this commit factors the TABLE_LIST creation out
to a new TABLE_LIST constructor. It then uses placement new()
to create it in the correct memory area (result of thd->calloc).
Benefits of this approach:
1. add_table_to_list now returns as early as possible on an error
2. fewer side-effects incurred on creating the TABLE_LIST object
3. TABLE_LIST won't be calloc'd if copy_to_db fails
4. local declarations moved closer to their respective first uses
5. improved code readability and logical flow
Also factored a couple of other functions to keep the happy path
more to the left, which makes them easier to follow at a glance.
According to the standard, the autoincrement column (i.e. *identity
column*) should be advanced each insert implicitly made by
UPDATE/DELETE ... FOR PORTION.
This is very unconvenient use in several notable cases. Concider a
WITHOUT OVERLAPS key with an autoinc column:
id int auto_increment, unique(id, p without overlaps)
An update or delete with FOR PORTION creates a sense that id will remain
unchanged in such case.
The standard's IDENTITY reminds MariaDB's AUTO_INCREMENT, however
the generation rules differ in many ways. For example, there's also a
notion autoincrement index, which is bound to the autoincrement field.
We will define our own generation rule for the PORTION OF operations
involving AUTO_INCREMENT:
* If an autoincrement index contains WITHOUT OVERLAPS specification, then
a new value should not be generated, otherwise it should.
Apart from WITHOUT OVERLAPS there is also another notable case, referred
by the reporter - a unique key that has an autoincrement column and a field
from the period specification:
id int auto_increment, unique(id, s), period for p(s, e)
for this case, no exception is made, and the autoincrementing rules will be
proceeded accordung to the standard (i.e. the value will be advanced on
implicit inserts).
use the original, not the truncated, field in the long unique prefix,
that is, in the hash(left(field, length)) expression.
because MyISAM CHECK/REPAIR in compute_vcols() moves table->field
but not prefix fields from keyparts.
Also, implement Field_string::cmp_prefix() for prefix comparison
of CHAR columns to work.
Most things where wrong in the test suite.
The one thing that was a bug was that table_map_id was in some places
defined as ulong and in other places as ulonglong. On Linux 64 bit this
is not a problem as ulong == ulonglong, but on windows this caused failures.
Fixed by ensuring that all instances of table_map_id are ulonglong.