This patch also fixes:
MDEV-33050 Build-in schemas like oracle_schema are accent insensitive
MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0
- Removing the virtual function strnncoll() from MY_COLLATION_HANDLER
- Adding a wrapper function CHARSET_INFO::streq(), to compare
two strings for equality. For now it calls strnncoll() internally.
In the future it will turn into a virtual function.
- Adding new accent sensitive case insensitive collations:
- utf8mb4_general1400_as_ci
- utf8mb3_general1400_as_ci
They implement accent sensitive case insensitive comparison.
The weight of a character is equal to the code point of its
upper case variant. These collations use Unicode-14.0.0 casefolding data.
The result of
my_charset_utf8mb3_general1400_as_ci.strcoll()
is very close to the former
my_charset_utf8mb3_general_ci.strcasecmp()
There is only a difference in a couple dozen rare characters, because:
- the switch from "tolower" to "toupper" comparison, to make
utf8mb3_general1400_as_ci closer to utf8mb3_general_ci
- the switch from Unicode-3.0.0 to Unicode-14.0.0
This difference should be tolarable. See the list of affected
characters in the MDEV description.
Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters!
Unlike utf8mb4_general_ci, it does not treat all BMP characters
as equal.
- Adding classes representing names of the file based database objects:
Lex_ident_db
Lex_ident_table
Lex_ident_trigger
Their comparison collation depends on the underlying
file system case sensitivity and on --lower-case-table-names
and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci.
- Adding classes representing names of other database objects,
whose names have case insensitive comparison style,
using my_charset_utf8mb3_general1400_as_ci:
Lex_ident_column
Lex_ident_sys_var
Lex_ident_user_var
Lex_ident_sp_var
Lex_ident_ps
Lex_ident_i_s_table
Lex_ident_window
Lex_ident_func
Lex_ident_partition
Lex_ident_with_element
Lex_ident_rpl_filter
Lex_ident_master_info
Lex_ident_host
Lex_ident_locale
Lex_ident_plugin
Lex_ident_engine
Lex_ident_server
Lex_ident_savepoint
Lex_ident_charset
engine_option_value::Name
- All the mentioned Lex_ident_xxx classes implement a method streq():
if (ident1.streq(ident2))
do_equal();
This method works as a wrapper for CHARSET_INFO::streq().
- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name"
in class members and in function/method parameters.
- Replacing all calls like
system_charset_info->coll->strcasecmp(ident1, ident2)
to
ident1.streq(ident2)
- Taking advantage of the c++11 user defined literal operator
for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h)
data types. Use example:
const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column;
is now a shorter version of:
const Lex_ident_column primary_key_name=
Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
We add an extra condition that makes the inequality testing in
SEQUENCE::increment_value() mathematically watertight, and we cast to
and from unsigned in potential underflow and overflow addition and
subtractions to avoid undefined behaviour.
Let's start by distinguishing between c++ expressions and mathematical
expressions. by c++ expression I mean an expression with the outcome
determined by the compiler/runtime. by mathematical expression I mean
an expression whose value is mathematically determined. So a c++
expression -9223372036854775806 - 1000 at worst can evaluate to any
value due to underflow. A mathematical expression -9223372036854775806
- 1000 evaluates to -9223372036854776806.
The problem boils down to how to write a c++ expression equivalent to
an mathematical expression x + y < z where x and z can take any values
of long long int, and y < 0 is also a long long int. Ideally we want
to avoid underflow, but I'm not sure how this can be done.
The correct c++ form should be (x + y < z || x < z - y || x < z).
Let M=9223372036854775808 i.e. LONGLONG_MAX + 1. We have
-M < x < M - 1
-M < y < 0
-M < z < M - 1
Let's consider the case where x + y < z is true as a mathematical
expression.
If the first disjunct underflows, i.e. the mathematical expression x
+ y < -M. If the arbitrary value resulting from the underflow causes
the c++ expression to hold too, then we are done. Otherwise we move
onto the next expression x < z - y. If there's no overflow in z
- y then we are done. If there's overflow i.e. z - y > M - 1,
and the c++ expression evals to false, then we are onto x < z.
There's no over or underflow here, and it will eval to true. To see
this, note that
x + y < -M means x < -M - y < -M - (-M) = 0
z - y > M - 1 means z > y + M - 1 > - M + M - 1 = -1
so x < z.
Now let's consider the case where x + y < z is false as a mathematical
expression.
The first disjunct will not underflow in this case, so we move to (x <
z - y). This will not overflow. To see this, note that
x + y >= z means z - y <= x < M - 1
So it evals to false too. And the third disjunct x < z also evals to
false because x >= z - y > z.
I suspect that in either case the expression x < z does not determine
the final value of the disjunction in the vast majority cases, which
is why we leave it as the final one in case of the rare cases of both
an underflow and an overflow happening.
Here's an example of both underflow and overflow happening and the
added inequality x < z saves the day:
x = - M / 2
y = - M / 2 - 1
z = M / 2
x + y evals to M - 1 which is > z
z - y evals to - M + 1 which is < x
We can do the same to test x + y > z where the increment y is positive:
(x > z - y || x + y > z || x > z)
And the same analysis applies to unsigned cases.
- Add `as <int_type>` to sequence creation options
- int_type can be signed or unsigned integer types, including
tinyint, smallint, mediumint, int and bigint
- Limitation: when alter sequence as <new_int_type>, cannot have any
other alter options in the same statement
- Limitation: increment remains signed longlong, and the hidden
constraint (cache_size x abs(increment) < longlong_max) stays for
unsigned types. This means for bigint unsigned, neither
abs(increment) nor (cache_size x abs(increment)) can be between
longlong_max and ulonglong_max
- Truncating maxvalue and minvalue from user input to the nearest max
or min value of the type, plus or minus 1. When the truncation
happens, a warning is emitted
- Information schema table for sequences
Problem is that Galera starts TOI (total order isolation) i.e.
it sends query to all nodes. Later it is discovered that
used engine or other feature is not supported by Galera.
Because TOI is executed parallelly in all nodes appliers
could execute given TOI and ignore the error and
start inconsistency voting causing node to leave from
cluster or we might have a crash as reported.
For example SEQUENCE engine does not support GEOMETRY data
type causing either inconsistency between nodes (because
some errors are ignored on applier) or crash.
Fixed my adding new function wsrep_check_support to check
can Galera support provided CREATE TABLE/SEQUENCE before TOI is
started and if not clear error message is provided to
the user.
Currently, not supported cases:
* CREATE TABLE ... AS SELECT when streaming replication is used
* CREATE TABLE ... WITH SYSTEM VERSIONING AS SELECT
* CREATE TABLE ... ENGINE=SEQUENCE
* CREATE SEQUENCE ... ENGINE!=InnoDB
* ALTER TABLE t ... ENGINE!=InnoDB where table t is SEQUENCE
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The bitmap is temporarily flipped to ~0 for the sake of checking all
fields. It needs to be restored because it will be reused in second
and subsequent ps execution.
While cleaning up a failed CREATE TABLE LIKE <sequence>, `mysql_rm_table_no_locks`
erroneously attempted to remove all tables involved in the query, including
the source table (sequence).
Fix to temporarily modify `table_list` to ensure that only the intended
table is removed during the cleanup.
MDEV-31503 ALTER SEQUENCE ends up in optimistic parallel slave binlog out-of-order
The OOO error still was possible even after MDEV-31077. This time
it occured through open_table() when the sequence table was not in
the table cache *and* the table was created before the last server
restart.
In such context a internal (read-only) transaction is committed
and it was not blocked from doing a wakeup() call to subsequent
transactions.
Fixed with extending suspend_subsequent_commits() effect for the entirety
of Sql_cmd_alter_sequence::execute().
An elaborated MDEV-31077 test proves the fixes of both failure scenarios.
Also the bug condition suggests a workaround to pre-SELECT sequence
tables before START SLAVE.
Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
Problem for Galera is the fact that sequences are not really
transactional. Sequence operation is committed immediately
in sql_sequence.cd and later Galera could find out that
we have changes but actual statement is not there anymore.
Therefore, we must make some restrictions what kind
of sequences Galera can support.
(1) Galera cluster supports only sequences implemented
by InnoDB storage engine. This is because Galera replication
supports currently only InnoDB.
(2) We do not allow LOCK TABLE on sequence object and
we do not allow sequence creation under LOCK TABLE, instead
lock is released and we issue warning.
(3) We allow sequences with NOCACHE definition or with
INCREMEMENT BY 0 CACHE=n definition. This makes sure that
sequence values are unique accross Galera cluster.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Making changes to wsrep_mysqld.h causes large parts of server code to
be recompiled. The reason is that wsrep_mysqld.h is included by
sql_class.h, even tough very little of wsrep_mysqld.h is needed in
sql_class.h. This commit introduces a new header file, wsrep_on.h,
which is meant to be included from sql_class.h, and contains only
macros and variable declarations used to determine whether wsrep is
enabled.
Also, header wsrep.h should only contain definitions that are also
used outside of sql/. Therefore, move WSREP_TO_ISOLATION* and
WSREP_SYNC_WAIT macros to wsrep_mysqld.h.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Problem:
========
When using sequences, the function
sequence_definition::write(TABLE *table, bool all_fields)
is used to save DML/DDL updates to sequence tables (e.g. nextval,
setval, and alter). Prior to this patch, the value all_fields was
always false when invoked via nextval and setval, which forced the
bitmap to only include changed columns.
Solution:
========
Change all_fields when invoked via nextval and setval to be reliant
on binlog_row_image, such that it is false when binlog_row_image is
MINIMAL, and true otherwise.
Reviewed By:
===========
Andrei Elkin <andrei.elkin@mariadb.com>
Sequence storage engine is not transactionl so cache will be written in
stmt_cache that is not replicated in cluster. To fix this replicate
what is available in both trans_cache and stmt_cache.
Sequences will only work when NOCACHE keyword is used when sequnce is
created. If WSREP is enabled and we don't have this keyword report error
indicting that sequence will not work correctly in cluster.
When binlog is enabled statement cache will be cleared in transaction
before COMMIT so cache generated from sequence will not be replicated.
We need to keep cache until replication.
Tests are re-recorded because of replication changes that were
introducted with this PR.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
CREATE-OR-REPLACE SEQUENCE is not logged with Gtid event DDL flag
which affects its slave parallel execution.
Unlike other DDL:s it can occur in concurrent execution with following transactions
which can lead to various errors, including asserts like
(mdl_request->type != MDL_INTENTION_EXCLUSIVE && mdl_request->type != MDL_EXCLUSIVE) || !(get_thd()->rgi_slave && get_thd()->rgi_slave->is_parallel_exec && lock->check_if_conflicting_replication_locks(this)
in MDL_context::acquire_lock.
Fixed to wrap internal statement level commit with save-
and-restore of TRANS_THD::m_unsafe_rollback_flags.
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.
The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
memory access of integers. Fixed by using byte_order_generic.h when
compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
safe to have overflows (two cases, in item_func.cc).
Things fixed:
- Don't left shift signed values
(byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
constructors. This was needed as UBSAN checks that these types has
correct values when one copies an object.
(gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
deleted objects.
(events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
on Query_arena object.
- Fixed several cast of objects to an incompatible class!
(Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
This includes also ++ and -- of integers.
(Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
value_type is initialized to this instead of to -1, which is not a valid
enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
instead of a null string (safer as it ensures we do not do arithmetic
on null strings).
Other things:
- Changed struct st_position to an OBJECT and added an initialization
function to it to ensure that we do not copy or use uninitialized
members. The change to a class was also motived that we used "struct
st_position" and POSITION randomly trough the code which was
confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr. (This variable was before
only in 10.5 and up). It can now have one of two values:
ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
it virtual. This was an effort to get UBSAN to work with loaded storage
engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
to integer arithmetic.
Changes that should not be needed but had to be done to suppress warnings
from UBSAN:
- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
some compile time warnings.
Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
The assertion failed in handler::ha_reset upon SELECT under
READ UNCOMMITTED from table with index on virtual column.
This was the debug-only failure, though the problem is mush wider:
* MY_BITMAP is a structure containing my_bitmap_map, the latter is a raw
bitmap.
* read_set, write_set and vcol_set of TABLE are the pointers to MY_BITMAP
* The rest of MY_BITMAPs are stored in TABLE and TABLE_SHARE
* The pointers to the stored MY_BITMAPs, like orig_read_set etc, and
sometimes all_set and tmp_set, are assigned to the pointers.
* Sometimes tmp_use_all_columns is used to substitute the raw bitmap
directly with all_set.bitmap
* Sometimes even bitmaps are directly modified, like in
TABLE::update_virtual_field(): bitmap_clear_all(&tmp_set) is called.
The last three bullets in the list, when used together (which is mostly
always) make the program flow cumbersome and impossible to follow,
notwithstanding the errors they cause, like this MDEV-17556, where tmp_set
pointer was assigned to read_set, write_set and vcol_set, then its bitmap
was substituted with all_set.bitmap by dbug_tmp_use_all_columns() call,
and then bitmap_clear_all(&tmp_set) was applied to all this.
To untangle this knot, the rule should be applied:
* Never substitute bitmaps! This patch is about this.
orig_*, all_set bitmaps are never substituted already.
This patch changes the following function prototypes:
* tmp_use_all_columns, dbug_tmp_use_all_columns
to accept MY_BITMAP** and to return MY_BITMAP * instead of my_bitmap_map*
* tmp_restore_column_map, dbug_tmp_restore_column_maps to accept
MY_BITMAP* instead of my_bitmap_map*
These functions now will substitute read_set/write_set/vcol_set directly,
and won't touch underlying bitmaps.
Problem was that FLUSH TABLES where trying to read latest sequence state
which conflicted with a running ALTER SEQUENCE. Removed the reading
of the state, when opening a table for FLUSH, as it's not needed in this
case.
Other thing:
- Fixed a potential issue with concurrently running ALTER SEQUENCE where
the later ALTER could potentially read old data
MDEV-22531 Remove maria::implicit_commit()
MDEV-22607 Assertion `ha_info->ht() != binlog_hton' failed in
MYSQL_BIN_LOG::unlog_xa_prepare
From the handler point of view, Aria now looks like a transactional
engine. One effect of this is that we don't need to call
maria::implicit_commit() anymore.
This change also forces the server to call trans_commit_stmt() after doing
any read or writes to system tables. This work will also make it easier
to later allow users to have system tables in other engines than Aria.
To handle the case that Aria doesn't support rollback, a new
handlerton flag, HTON_NO_ROLLBACK, was added to engines that has
transactions without rollback (for the moment only binlog and Aria).
Other things
- Moved freeing of MARIA_SHARE to a separate function as the MARIA_SHARE
can be still part of a transaction even if the table has closed.
- Changed Aria checkpoint to use the new MARIA_SHARE free function. This
fixes a possible memory leak when using S3 tables
- Changed testing of binlog_hton to instead test for HTON_NO_ROLLBACK
- Removed checking of has_transaction_manager() in handler.cc as we can
assume that as the transaction was started by the engine, it does
support transactions.
- Added new class 'start_new_trans' that can be used to start indepdendent
sub transactions, for example while reading mysql.proc, using help or
status tables etc.
- open_system_tables...() and open_proc_table_for_Read() doesn't anymore
take a Open_tables_backup list. This is now handled by 'start_new_trans'.
- Split thd::has_transactions() to thd::has_transactions() and
thd::has_transactions_and_rollback()
- Added handlerton code to free cached transactions objects.
Needed by InnoDB.
squash! 2ed35999f2a2d84f1c786a21ade5db716b6f1bbc
All changes (except one) is of type
thd->transaction. -> thd->transaction->
thd->transaction points by default to 'thd->default_transaction'
This allows us to 'easily' have multiple active transactions for a
THD object, like when reading data from the mysql.proc table