Problem: In regular replication, when master binlogged using statement format
slave might not have written an event to its binary log when the Query
event aimed at a temporary table.
Specifically this was observed with LOAD DATA INFILE.
This effect was possible because unlike master slave holds temporary
tables in its pool and the master side check of existence of a
temporary table at the format bin-logging decision did not apply.
Solution: replace THD::has_thd_temporary_tables() with
THD::has_temporary_tables which allows to identify temporary table
presence on either side.
--
Reviewed by Andrei Elkin.
extra2_read_len resolved by keeping the implementation
in sql/table.cc by exposed it for use by ha_partition.cc
Remove identical implementation in unireg.h
(ref: bfed2c7d57)
whenever possible, partitioning should use the full
partition plugin name, not the one byte legacy code.
Normally, ha_partition can get the engine plugin from
table_share->default_part_plugin.
But in some cases, e.g. in DROP TABLE, the table isn't
opened, table_share is NULL, and ha_partition has to parse
the frm, much like dd_frm_type() does.
temporary_tables.cc, sql_table.cc:
When dropping a table, it must be deleted in the engine
first, then frm file. Because frm can be the only true
source of metadata that the engine might need for DROP.
table.cc:
when opening a partitioned table, if the engine for
partitions is not found, do not fallback to MyISAM.
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.
The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
memory access of integers. Fixed by using byte_order_generic.h when
compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
safe to have overflows (two cases, in item_func.cc).
Things fixed:
- Don't left shift signed values
(byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
constructors. This was needed as UBSAN checks that these types has
correct values when one copies an object.
(gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
deleted objects.
(events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
on Query_arena object.
- Fixed several cast of objects to an incompatible class!
(Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
This includes also ++ and -- of integers.
(Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
value_type is initialized to this instead of to -1, which is not a valid
enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
instead of a null string (safer as it ensures we do not do arithmetic
on null strings).
Other things:
- Changed struct st_position to an OBJECT and added an initialization
function to it to ensure that we do not copy or use uninitialized
members. The change to a class was also motived that we used "struct
st_position" and POSITION randomly trough the code which was
confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr. (This variable was before
only in 10.5 and up). It can now have one of two values:
ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
it virtual. This was an effort to get UBSAN to work with loaded storage
engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
to integer arithmetic.
Changes that should not be needed but had to be done to suppress warnings
from UBSAN:
- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
some compile time warnings.
Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
MDEV-21398 Deadlock (server hang) or assertion failure in
Diagnostics_area::set_error_status upon ALTER under lock
This failure could only happen if one locked the same table
multiple times and then did an ALTER TABLE on the table.
Major change is to change all instances of
table->m_needs_reopen= true;
to
table->mark_table_for_reopen();
The main fix for the problem was to ensure that we mark all
instances of the table in the locked_table_list and when we
reopen the tables, we first close all tables before reopening
and locking them.
Other things:
- Don't call thd->locked_tables_list.reopen_tables if there
are no tables marked for reopen. (performance)
When there is a WITH clause we postpone check for tables without
database for later stages when tables in WITH will be defined.
But we should not try to open such tables as temporary tables because
temporary tables always belong to a some database.
The test cases for the MDEV found several independent bugs
in MariaDB server and Aria:
- If a temporary table was marked as crashed, it could never
be deleted.
- Opening of a crashed temporary table gave an error message
but the error was never forwarded to the caller which caused
an assert() in my_ok()
- init_read_record() did mmap of all temporary tables, which is
probably not a good idea as this area can potentially be
very big. Changed code to only mmap internal temporary tables.
- mmap-ed tables where not unmapped in case of repair/optimize
which caused bad data in table and crashes if the original
table files where replaced with new ones (as the old mmap
was still in place). Fixed by removing the mmap in case
of repair.
- Cleaned up usage of code that disabled mmap in Aria
Problem:
========
The test now fails with the following trace:
CURRENT_TEST: rpl.rpl_parallel_temptable
--- /mariadb/10.4/mysql-test/suite/rpl/r/rpl_parallel_temptable.result
+++ /mariadb/10.4/mysql-test/suite/rpl/r/rpl_parallel_temptable.reject
@@ -194,7 +194,6 @@
30 conservative
31 conservative
32 optimistic
-33 optimistic
Analysis:
=========
The part of test which fails with result content mismatch is given below.
CREATE TEMPORARY TABLE t4 (a INT PRIMARY KEY) ENGINE=InnoDB;
INSERT INTO t4 VALUES (32);
INSERT INTO t4 VALUES (33);
INSERT INTO t1 SELECT a, "optimistic" FROM t4;
slave_parallel_mode=optimistic
The expectation of the above test script is, INSERT FROM SELECT should read both
32, 33 and populate table 't1'. But this expectation fails occasionally.
All three INSERT statements are handed over to three different slave parallel
workers. Temporary tables are not safe for parallel replication. They were
designed to be visible to one thread only, so have no table locking. Thus there
is no protection against two conflicting transactions committing in parallel and
things like that.
So anything that uses temporary tables will be serialized with anything before
it, when using parallel replication by using a "wait_for_prior_commit" function
call. This will ensure that the each transaction is executed sequentially.
But there exists a code path in which the above wait doesn't happen. Because of
this at times INSERT from SELECT doesn't wait for the INSERT (33) to complete
and it completes its executes and enters commit stage. Hence only row 32 is
found in those cases resulting in test failure.
The wait needs to be added within "open_temporary_table" call. The code looks
like this within "open_temporary_table".
Each thread tries to open temporary table in 3 different ways:
case 1: Find a temporary table which is already in use by using
find_temporary_table(tl) && wait_for_prior_commit()
case 2: If above failed then try to look for temporary table which is marked for
free for reuse. This internally calls "wait_for_prior_commit()" if table
is found.
find_and_use_tmp_table(tl, &table)
case 3: If none of the above open a new table handle from table share.
if (!table && (share= find_tmp_table_share(tl)))
{ table= open_temporary_table(share, tl->get_table_name(), true); }
At present the "wait_for_prior_commit" happens only in case 1 & 2.
Fix:
====
On slave add a call for "wait_for_prior_commit" for case 3.
The above wait on slave will solve the issue. A more detailed fix would be to
mark temporary tables as not safe for parallel execution on the master side.
In order to do that, on the master side, mark the Gtid_log_event specific flag
FL_TRANSACTIONAL to be false all the time. So that they are not scheduled
parallely.
truncating a temporary table
TRUNCATE expects only one TABLE instance (which is used by TRUNCATE
itself) to be open. However this requirement wasn't enforced after
"MDEV-5535: Cannot reopen temporary table".
Fixed by closing unused table instances before performing TRUNCATE.
Do not register intermediate tables created by inplace ALTER TABLE in
THD::temporary_tables.
Regular ALTER TABLE doesn't create .frm for temporary and discoverable
tables anymore. For inplace ALTER TABLE moved .frm creation to
create_table_for_inplace_alter().
Removed open_in_engine argument of create_and_open_tmp_table() and
open_temporary_table(): it became unused after this patch.
Part of MDEV-17805 - Remove InnoDB cache for temporary tables.
CREATE TEMPORARY TABLE locks SE plugin 6 times. 5 of these locks are
released by the end of the statement. And only 1 acquired by
init_from_binary_frm_image() / plugin_lock() remains.
The lock removed in this patch was clearly redundant.
Part of MDEV-17805 - Remove InnoDB cache for temporary tables.
This was caused by a combination of factors:
* MyISAM/Aria temporary tables historically never saved the state
to disk (MYI/MAI), because the state never needed to persist
* certain ALTER TABLE operations modify the original TABLE structure
and if they fail, the original table has to be reopened to
revert all changes (m_needs_reopen=1)
as a result, when ALTER fails and MyISAM/Aria temp table gets reopened,
it reads the stale state from the disk.
As a fix, MyISAM/Aria tables now *always* write the state to disk
on close, *unless* HA_EXTRA_PREPARE_FOR_DROP was done first. And
the server now always does HA_EXTRA_PREPARE_FOR_DROP before dropping
a temporary table.
THD::close_temporary_tables(): Revert the change.
ha_innobase::delete_table(): Move the work-around inside
a debug assertion, and check thd_kill_level() instead of thd_killed(),
because the latter would not hold for KILL_CONNECTION.
THD::close_temporary_tables(): Assign lex->sql_command so that
the debug assertion will not fail in ha_innobase::delete_table().
Alternatively, we could ensure that thd_killed() holds inside
ha_innobase::delete_table().
There should be no impact for the non-debug build. The thd_sql_command()
inside ha_innobase::delete_table() only affects the treatment of
persistent FOREIGN KEY metadata. There is no persistent metadata
nor foreign key constraints for temporary tables.
No test case was added, because the failure is nondeterministic.
truncating a temporary table
TRUNCATE expects only one TABLE instance (which is used by TRUNCATE
itself) to be open. However this requirement wasn't enforced after
"MDEV-5535: Cannot reopen temporary table".
Fixed by closing unused table instances before performing TRUNCATE.
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider,spider, sphinx and connect for now.
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db
Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because process list_db wasn't always
correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed not used version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
Other changes done to get this to work:
- Added 'internal_tables' to TABLE object to list which sequence tables
is needed to use the table.
- Mark any expression using DEFAULT() with LEX->default_used.
This is needed when deciding if we should open internal sequence
tables when a table is opened (we don't need to open sequence tables
if the main table is only used with SELECT).
- Create_and_open_temporary_table() can now also open all internal
sequence tables.
- Added option MYSQL_LOCK_USE_MALLOC to mysql_lock_tables()
to force memory allocation to be used with malloc instead of
memroot.
- Added flag to MYSQL_LOCK to remember if allocation was done with
malloc or memroot (makes code simpler and safer).
- init_one_table_for_prelocking() now takes argument for what lock to
use instead of it's a routine or something else.
- Renamed prelocking placeholders to make them more understandable as
they are now used in more code.
- Changed test in check_lock_and_start_stmt() if found table has correct
locks. The old test didn't work for tables that has lock
TL_WRITE_ALLOW_WRITE, which is what sequence tables are using.
- Added VCOL_NOT_VIRTUAL option to ensure that sequence functions can't
be used with virtual columns
- More sequence tests
- Fix win64 pointer truncation warnings
(usually coming from misusing 0x%lx and long cast in DBUG)
- Also fix printf-format warnings
Make the above mentioned warnings fatal.
- fix pthread_join on Windows to set return value.
- Added TABLE_SHARE->not_usable_by_query_cache
- Moved TABLE->no_replicate to TABLE_SHARE->no_replicate as it's same for
all TABLE instances
- Renamed TABLE_SHARE->cached_row_logging_check to can_do_row_logging
- Added sql/mariadb.h file that should be included first by files in sql
directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h