ALTER TABLE .. RENAME, when used with the inplace algorithm, does:
- Do an inplace or online alter to the new definition
- Rename to new name
- Update triggers.
If update triggers would fail, we would rename the table back.
The problem with this approach is that the table would have the new
definition but the rename would fail. The binary log would also not be
updated.
The solution to this is to very early check if we can rename triggers
and give an error if this would fail.
Both ALTER TABLE ... RENAME and RENAME TABLE is fixed.
This was implemented by moving the pre-check of rename table in triggers
from Table_triggers_list::change_table_name() to
Table_triggers_list::prepare_for_rename().
Before this fix, one would get a 'Trigger ... already exists' when trying
to create a trigger matching the original name and 'Trigger ... does not
exists" when trying to drop it.
Fixes a reported bug in MDEV-25180 Atomic ALTER TABLE
MDEV-25517 Atomic DDL: Assertion `query_arg' in THD::binlog_query
upon DROP TRIGGER
The bug was that the stmt_query variable was not populated
with the query in case of DROP TRIGGER of an orphan trigger
(.TRN file exists & table exists, but the trigger was not in
table->triggers).
The purpose of this task is to ensure that CREATE TRIGGER is atomic
When a trigger is created, we first create a trigger_name.TRN file and then
create or update the table_name.TRG files.
This is done by creating .TRN~ and .TRG~ files and replacing (or creating)
the result files.
The new logic is
- Log CREATE TRIGGER to DDL log, with a marker if old trigger existsted
- If old .TRN or .TRG files exists, make backup copies of these
- Create the new .TRN and .TRG files as before
- Remove the backups
Crash recovery
- If query has been logged to binary log:
- delete any left over backup files
- else
- Delete any old .TRN~ or .TRG~ files
- If there was orignally some triggers (old .TRG file existed)
- If we crashed before creating all backup files
- Delete existing backup files
- else
- Restore backup files
- end
- Delete .TRN and .TRG file (as there was no triggers before
One benefit of the new code is that CREATE OR REPLACE TRIGGER is now
totally atomic even if there existed an old trigger: Either the old
trigger will be replaced or the old one will be left untouched.
Other things:
- If sql_create_definition_file() would fail, there could be memory leaks
in CREATE TRIGGER, DROP TRIGGER or CREATE OR REPLACE TRIGGER. This
is now fixed.
The purpose of this task is to ensure that DROP TRIGGER is atomic.
Description of how atomic drop trigger works:
Logging of DROP TRIGGER
Log the following information:
db
table name
trigger name
xid /* Used to check if query was already logged to binary log */
initial length of the .TRG file
query if there is space for it, if not log a zero length query.
Recovery operations:
- Delete if exists 'database/trigger_name.TRN~'
- If this file existed, it means that we crashed before the trigger
was deleted and there is nothing else to do.
- Get length of .TRG file
- If file length is unchanged, trigger was not dropped. Nothing else to
do.
- Log original query to binary, if it was stored in the ddl log. If it was
not stored (long query string), log the following query to binary log:
use `database` ; DROP TRIGGER IF EXISTS `trigger_name`
/* generated by ddl log */;
Other things:
- Added trigger name and DDL_LOG_STATE to drop_trigger()
Trigger name was added to make the interface more consistent and
more general.
- Major rewrite of ddl_log.cc and ddl_log.h
- ddl_log.cc described in the beginning how the recovery works.
- ddl_log.log has unique signature and is dynamic. It's easy to
add more information to the header and other ddl blocks while still
being able to execute old ddl entries.
- IO_SIZE for ddl blocks is now dynamic. Can be changed without affecting
recovery of old logs.
- Code is more modular and is now usable outside of partition handling.
- Renamed log file to dll_recovery.log and added option --log-ddl-recovery
to allow one to specify the path & filename.
- Added ddl_log_entry_phase[], number of phases for each DDL action,
which allowed me to greatly simply set_global_from_ddl_log_entry()
- Changed how strings are stored in log entries, which allows us to
store much more information in a log entry.
- ddl log is now always created at start and deleted on normal shutdown.
This simplices things notable.
- Added probes debug_crash_here() and debug_simulate_error() to simply
crash testing and allow crash after a given number of times a probe
is executed. See comments in debug_sync.cc and rename_table.test for
how this can be used.
- Reverting failed table and view renames is done trough the ddl log.
This ensures that the ddl log is tested also outside of recovery.
- Added helper function 'handler::needs_lower_case_filenames()'
- Extend binary log with Q_XID events. ddl log handling is using this
to check if a ddl log entry was logged to the binary log (if yes,
it will be deleted from the log during ddl_log_close_binlogged_events()
- If a DDL entry fails 3 time, disable it. This is to ensure that if
we have a crash in ddl recovery code the server will not get stuck
in a forever crash-restart-crash loop.
mysqltest.cc changes:
- --die will now replace $variables with their values
- $error will contain the error of the last failed statement
storage engine changes:
- maria_rename() was changed to be more robust against crashes during
rename.
This change removed 68 explict strlen() calls from the code.
The following renames was done to ensure we don't use the old names
when merging code from earlier releases, as using the new variables
for print function could result in crashes:
- charset->csname renamed to charset->cs_name
- charset->name renamed to charset->coll_name
Almost everything where mechanical changes except:
- Changed to use the new Protocol::store(LEX_CSTRING..) when possible
- Changed to use field->store(LEX_CSTRING*, CHARSET_INFO*) when possible
- Changed to use String->append(LEX_CSTRING&) when possible
Other things:
- There where compiler issues with ensuring that all character set names
points to the same string: gcc doesn't allow one to use integer constants
when defining global structures (constant char * pointers works fine).
To get around this, I declared defines for each character set name
length.
- Changed order of class fields to remove dead alignment space.
- Changed bool fields in Item to bit fields.
- Used packed enum's for some fields in common classes
- Removed not used Item::rsize.
- Changed some class variables from uint/int to smaller type int's.
- Ensured that field_index is uint16 in all classes and functions. Fixed
also that we proparly compare with NO_CACHED_FIELD_INDEX when checking
if variable is not set.
- Removed checking of highest bit of unireg_check (has not been used in
a long time)
- Fixed wrong arguments to make_cond_for_table() for join_tab_idx_arg
from false to 0.
One of the result was reducing the size if class Item with ~24 bytes
This patch changes the main name of 3 byte character set from utf8 to
utf8mb3. New old_mode UTF8_IS_UTF8MB3 is added and set TRUE by default,
so that utf8 would mean utf8mb3. If not set, utf8 would mean utf8mb4.
The problem was that in a timeout event,
thd->lex->restore_backup_query_tables_list() was called when it should
not have been.
Patch tested with the script in MDEV-25651 (not suitable for mtr)
The bug is that we don't have a a lock on the trigger name, so it is
possible for two threads to try to create the same trigger at the same
time and both thinks that they have succeed.
Same thing can happen with drop trigger or a combinations of create and
drop trigger.
Fixed by adding a mdl lock for the trigger name for the duration of the
create/drop.
Patch tested by Elena
Use < TL_FIRST_WRITE for determining a READ transaction.
Use TL_FIRST_WRITE as the relative operator replacing TL_WRITE_ALLOW_WRITE
as the minimium WRITE lock type.
Added new enum variable `wsrep_mode` which can be used to turn on WSREP
features which are not part of default behaviour.
Added enum `BINLOG_ROW_FORMAT_ONLY`, `REQUIRED_PRIMARY_KEY` and
`STRICT_REPLICATION`. `wsrep-mode=STRICT_REPLICATION` behaves
like variable `wsrep_strict_ddl`.
Variable wsrep_strict_ddl is deprecated and if set we use
new wsrep_mode setting instead.
Reviewed and improved by: Jan Lindström <jan.lindstrom@mariadb.com>
The used code is largely based on code from Tencent
The problem is that in some rare cases there may be a conflict between .frm
files and the files in the storage engine. In this case the DROP TABLE
was not able to properly drop the table.
Some MariaDB/MySQL forks has solved this by adding a FORCE option to
DROP TABLE. After some discussion among MariaDB developers, we concluded
that users expects that DROP TABLE should always work, even if the
table would not be consistent. There should not be a need to use a
separate keyword to ensure that the table is really deleted.
The used solution is:
- If a .frm table doesn't exists, try dropping the table from all storage
engines.
- If the .frm table exists but the table does not exist in the engine
try dropping the table from all storage engines.
- Update storage engines using many table files (.CVS, MyISAM, Aria) to
succeed with the drop even if some of the files are missing.
- Add HTON_AUTOMATIC_DELETE_TABLE to handlerton's where delete_table()
is not needed and always succeed. This is used by ha_delete_table_force()
to know which handlers to ignore when trying to drop a table without
a .frm file.
The disadvantage of this solution is that a DROP TABLE on a non existing
table will be a bit slower as we have to ask all active storage engines
if they know anything about the table.
Other things:
- Added a new flag MY_IGNORE_ENOENT to my_delete() to not give an error
if the file doesn't exist. This simplifies some of the code.
- Don't clear thd->error in ha_delete_table() if there was an active
error. This is a bug fix.
- handler::delete_table() will not abort if first file doesn't exists.
This is bug fix to handle the case when a drop table was aborted in
the middle.
- Cleaned up mysql_rm_table_no_locks() to ensure that if_exists uses
same code path as when it's not used.
- Use non_existing_Table_error() to detect if table didn't exists.
Old code used different errors tests in different position.
- Table_triggers_list::drop_all_triggers() now drops trigger file if
it can't be parsed instead of leaving it hanging around (bug fix)
- InnoDB doesn't anymore print error about .frm file out of sync with
InnoDB directory if .frm file does not exists. This change was required
to be able to try to drop an InnoDB file when .frm doesn't exists.
- Fixed bug in mi_delete_table() where the .MYD file would not be dropped
if the .MYI file didn't exists.
- Fixed memory leak in Mroonga when deleting non existing table
- Fixed memory leak in Connect when deleting non existing table
Bugs fixed introduced by the original version of this commit:
MDEV-22826 Presence of Spider prevents tables from being force-deleted from
other engines
Backported the support for aborting and replaying stored procedure and fix for trigger
key assigments from 10.4 version.
Backported also two mtr tests: wsrep_sp_bf_abort and MDEV-20225
MDEV-22531 Remove maria::implicit_commit()
MDEV-22607 Assertion `ha_info->ht() != binlog_hton' failed in
MYSQL_BIN_LOG::unlog_xa_prepare
From the handler point of view, Aria now looks like a transactional
engine. One effect of this is that we don't need to call
maria::implicit_commit() anymore.
This change also forces the server to call trans_commit_stmt() after doing
any read or writes to system tables. This work will also make it easier
to later allow users to have system tables in other engines than Aria.
To handle the case that Aria doesn't support rollback, a new
handlerton flag, HTON_NO_ROLLBACK, was added to engines that has
transactions without rollback (for the moment only binlog and Aria).
Other things
- Moved freeing of MARIA_SHARE to a separate function as the MARIA_SHARE
can be still part of a transaction even if the table has closed.
- Changed Aria checkpoint to use the new MARIA_SHARE free function. This
fixes a possible memory leak when using S3 tables
- Changed testing of binlog_hton to instead test for HTON_NO_ROLLBACK
- Removed checking of has_transaction_manager() in handler.cc as we can
assume that as the transaction was started by the engine, it does
support transactions.
- Added new class 'start_new_trans' that can be used to start indepdendent
sub transactions, for example while reading mysql.proc, using help or
status tables etc.
- open_system_tables...() and open_proc_table_for_Read() doesn't anymore
take a Open_tables_backup list. This is now handled by 'start_new_trans'.
- Split thd::has_transactions() to thd::has_transactions() and
thd::has_transactions_and_rollback()
- Added handlerton code to free cached transactions objects.
Needed by InnoDB.
squash! 2ed35999f2a2d84f1c786a21ade5db716b6f1bbc
MDEV-21605 Clean up and speed up interfaces for binary row logging
MDEV-21617 Bug fix for previous version of this code
The intention is to have as few 'if' as possible in ha_write() and
related functions. This is done by pre-calculating once per statement the
row_logging state for all tables.
Benefits are simpler and faster code both when binary logging is disabled
and when it's enabled.
Changes:
- Added handler->row_logging to make it easy to check it table should be
row logged. This also made it easier to disabling row logging for system,
internal and temporary tables.
- The tables row_logging capabilities are checked once per "statements
that updates tables" in THD::binlog_prepare_for_row_logging() which
is called when needed from THD::decide_logging_format().
- Removed most usage of tmp_disable_binlog(), reenable_binlog() and
temporary saving and setting of thd->variables.option_bits.
- Moved checks that can't change during a statement from
check_table_binlog_row_based() to check_table_binlog_row_based_internal()
- Removed flag row_already_logged (used by sequence engine)
- Moved binlog_log_row() to a handler::
- Moved write_locked_table_maps() to THD::binlog_write_table_maps() as
most other related binlog functions are in THD.
- Removed binlog_write_table_map() and binlog_log_row_internal() as
they are now obsolete as 'has_transactions()' is pre-calculated in
prepare_for_row_logging().
- Remove 'is_transactional' argument from binlog_write_table_map() as this
can now be read from handler.
- Changed order of 'if's in handler::external_lock() and wsrep_mysqld.h
to first evaluate fast and likely cases before more complex ones.
- Added error checking in ha_write_row() and related functions if
binlog_log_row() failed.
- Don't clear check_table_binlog_row_based_result in
clear_cached_table_binlog_row_based_flag() as it's not needed.
- THD::clear_binlog_table_maps() has been replaced with
THD::reset_binlog_for_next_statement()
- Added 'MYSQL_OPEN_IGNORE_LOGGING_FORMAT' flag to open_and_lock_tables()
to avoid calculating of binary log format for internal opens. This flag
is also used to avoid reading statistics tables for internal tables.
- Added OPTION_BINLOG_LOG_OFF as a simple way to turn of binlog temporary
for create (instead of using THD::sql_log_bin_off.
- Removed flag THD::sql_log_bin_off (not needed anymore)
- Speed up THD::decide_logging_format() by remembering if blackhole engine
is used and avoid a loop over all tables if it's not used
(the common case).
- THD::decide_logging_format() is not called anymore if no tables are used
for the statement. This will speed up pure stored procedure code with
about 5%+ according to some simple tests.
- We now get annotated events on slave if a CREATE ... SELECT statement
is transformed on the slave from statement to row logging.
- In the original code, the master could come into a state where row
logging is enforced for all future events if statement could be used.
This is now partly fixed.
Other changes:
- Ensure that all tables used by a statement has query_id set.
- Had to restore the row_logging flag for not used tables in
THD::binlog_write_table_maps (not normal scenario)
- Removed injector::transaction::use_table(server_id_type sid, table tbl)
as it's not used.
- Cleaned up set_slave_thread_options()
- Some more DBUG_ENTER/DBUG_RETURN, code comments and minor indentation
changes.
- Ensure we only call THD::decide_logging_format_low() once in
mysql_insert() (inefficiency).
- Don't annotate INSERT DELAYED
- Removed zeroing pos_in_table_list in THD::open_temporary_table() as it's
already 0
Introduced a new wsrep_strict_ddl configuration variable in which
Galera checks storage engine of the effected table. If table is not
InnoDB (only storage engine currently fully supporting Galera
replication) DDL-statement will return error code:
ER_GALERA_REPLICATION_NOT_SUPPORTED
eng "DDL-statement is forbidden as table storage engine does not support Galera replication"
However, when wsrep_replicate_myisam=ON we allow DDL-statements to
MyISAM tables. If effected table is allowed storage engine Galera
will run normal TOI.
This new setting should be for now set globally on all
nodes in a cluster. When this setting is set following DDL-clauses
accessing tables not supporting Galera replication are refused:
* CREATE TABLE (e.g. CREATE TABLE t1(a int) engine=Aria
* ALTER TABLE
* TRUNCATE TABLE
* CREATE VIEW
* CREATE TRIGGER
* CREATE INDEX
* DROP INDEX
* RENAME TABLE
* DROP TABLE
Statements on PROCEDURE, EVENT, FUNCTION are allowed as effected
tables are known only at execution. Furthermore, USER, ROLE, SERVER,
DATABASE statements are also allowed as they do not really have
effected table.
(Variant #2 of the patch, which keeps the sp_head object inside the
MEM_ROOT that sp_head object owns)
(10.3 requires extra work due to sp_package, will commit a separate
patch for it)
sp_head::operator new() and operator delete() were dereferencing sp_head*
pointers to memory that didn't hold a valid sp_head object (it was
not created/already destroyed).
This caused UBSan to crash when looking up type information.
Fixed by providing static sp_head::create() and sp_head::destroy() methods.
(Variant #2 of the patch, which keeps the sp_head object inside the
MEM_ROOT that sp_head object owns)
(10.3 version of the fix, with handling for class sp_package)
sp_head::operator new() and operator delete() were dereferencing sp_head*
pointers to memory that didn't hold a valid sp_head object (it was
not created/already destroyed).
This caused UBSan to crash when looking up type information.
Fixed by providing static sp_head::create() and sp_head::destroy() methods.
* MDEV-20225 BF aborting SP execution
When stored procedure execution was chosen as victim for a BF abort, the old implemnetationn called for rollback immediately
when execution was inside SP isntruction. Technically this happened in wsrep_after_statement() call, which identified the
need for a rollback.
The problem was that MariaDB does not accept rollback (nor commit) inside sub statement, there are several asserts about it,
checking for THD::in_sub_stmt.
This patch contains a fix, which skips calling wsrep_after_statement() for SP execution, which is marked as BF must abort. Instead,
we return error code to upper level, where rollback will eventually happen, ouside of SP execution.
Also, appending the affected trigger table (dropped or created) in the populated key set for the write set,
which prevents parallel applying of other transactions working on the same table.
* MDEV-20225 BF aborting SP execution, second patch
First PR missed 4 commits, which are now squashed in this patch:
- Added galera_sp_bf_abort test.
A MTR test case which will reproduce BF-BF conflict if all keys
corresponding to affected tables are not assigned for DROP TRIGGER.
- Fixed incorrect use of sync pointsin MDEV-20225
- Added condition for SQLCOM_DROP_TRIGGER in wsrep_can_run_in_toi()
to make it replicate.
* MDEV-20225 BF aborting SP execution, third patch
The galera_trigger.test caused a situation, where SP invocation caused a trigger
to fire, and the trigger executed as sub statement SP, and was BF aborted by applier.
because of wsrep_after_statement() was called for the sub-statement level, it ended up
in exeuting rollback and asserted there.
Thus fix will catch sub-statement level SP execution, and avoids calling wsrep_after_statement()
The MDEV-17262 commit 26432e49d3
was skipped. In Galera 4, the implementation would seem to require
changes to the streaming replication.
In the tests archive.rnd_pos main.profiling, disable_ps_protocol
for SHOW STATUS and SHOW PROFILE commands until MDEV-18974
has been fixed.
There were two newly enabled warnings:
1. cast for a function pointers. Affected sql_analyse.h, mi_write.c
and ma_write.cc, mf_iocache-t.cc, mysqlbinlog.cc, encryption.cc, etc
2. memcpy/memset of nontrivial structures. Fixed as:
* the warning disabled for InnoDB
* TABLE, TABLE_SHARE, and TABLE_LIST got a new method reset() which
does the bzero(), which is safe for these classes, but any other
bzero() will still cause a warning
* Table_scope_and_contents_source_st uses `TABLE_LIST *` (trivial)
instead of `SQL_I_List<TABLE_LIST>` (not trivial) so it's safe to
bzero now.
* added casts in debug_sync.cc and sql_select.cc (for JOIN)
* move assignment method for MDL_request instead of memcpy()
* PARTIAL_INDEX_INTERSECT_INFO::init() instead of bzero()
* remove constructor from READ_RECORD() to make it trivial
* replace some memcpy() with c++ copy assignments
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.
main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.