In a rebase of the merge, two preceding commits were accidentally reverted:
commit 112b23969a (MDEV-26308)
commit ac2857a5fb (MDEV-25717)
Thanks to Daniele Sciascia for noticing this.
If cluster is bootstrapped in existing database, we should use provided
configuration variables for wsrep_gtid_domain_id and server_id instead
of recovered ones.
If 'new' combination of wsrep_gtid_domain_id & server_id already existed
somewere before in binlog we should continue from last seqno, if
combination is new we start from seqno 0.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Contains following fixes:
* allow TOI commands to timeout while trying to acquire TOI with
override lock_wait_timeout with a LONG_TIMEOUT only after
succesfully entering TOI
* only ignore lock_wait_timeout on TOI
* fix galera_split_brain test as TOI operation now returns ER_LOCK_WAIT_TIMEOUT after lock_wait_timeout
* explicitly test for TOI
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
make BACKUP STAGE behave as FTWRL, desyncing and pausing the node
to prevent BF threads (appliers) from interfering with blocking stages.
This is needed because BF threads don't respect BACKUP MDL locks.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
because the name was misleading, it counts not threads, but THDs,
and as THD_count is the only way to increment/decrement it, it
could as well be declared inside THD_count.
If temporary internal table is in use `hton` will not be set. Skip check
if DDL should be replicated in this case.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Changes:
- To detect automatic strlen() I removed the methods in String that
uses 'const char *' without a length:
- String::append(const char*)
- Binary_string(const char *str)
- String(const char *str, CHARSET_INFO *cs)
- append_for_single_quote(const char *)
All usage of append(const char*) is changed to either use
String::append(char), String::append(const char*, size_t length) or
String::append(LEX_CSTRING)
- Added STRING_WITH_LEN() around constant string arguments to
String::append()
- Added overflow argument to escape_string_for_mysql() and
escape_quotes_for_mysql() instead of returning (size_t) -1 on overflow.
This was needed as most usage of the above functions never tested the
result for -1 and would have given wrong results or crashes in case
of overflows.
- Added Item_func_or_sum::func_name_cstring(), which returns LEX_CSTRING.
Changed all Item_func::func_name()'s to func_name_cstring()'s.
The old Item_func_or_sum::func_name() is now an inline function that
returns func_name_cstring().str.
- Changed Item::mode_name() and Item::func_name_ext() to return
LEX_CSTRING.
- Changed for some functions the name argument from const char * to
to const LEX_CSTRING &:
- Item::Item_func_fix_attributes()
- Item::check_type_...()
- Type_std_attributes::agg_item_collations()
- Type_std_attributes::agg_item_set_converter()
- Type_std_attributes::agg_arg_charsets...()
- Type_handler_hybrid_field_type::aggregate_for_result()
- Type_handler_geometry::check_type_geom_or_binary()
- Type_handler::Item_func_or_sum_illegal_param()
- Predicant_to_list_comparator::add_value_skip_null()
- Predicant_to_list_comparator::add_value()
- cmp_item_row::prepare_comparators()
- cmp_item_row::aggregate_row_elements_for_comparison()
- Cursor_ref::print_func()
- Removes String_space() as it was only used in one cases and that
could be simplified to not use String_space(), thanks to the fixed
my_vsnprintf().
- Added some const LEX_CSTRING's for common strings:
- NULL_clex_str, DATA_clex_str, INDEX_clex_str.
- Changed primary_key_name to a LEX_CSTRING
- Renamed String::set_quick() to String::set_buffer_if_not_allocated() to
clarify what the function really does.
- Rename of protocol function:
bool store(const char *from, CHARSET_INFO *cs) to
bool store_string_or_null(const char *from, CHARSET_INFO *cs).
This was done to both clarify the difference between this 'store' function
and also to make it easier to find unoptimal usage of store() calls.
- Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
- Changed some 'const char*' arrays to instead be of type LEX_CSTRING.
- class Item_func_units now used LEX_CSTRING for name.
Other things:
- Fixed a bug in mysql.cc:construct_prompt() where a wrong escape character
in the prompt would cause some part of the prompt to be duplicated.
- Fixed a lot of instances where the length of the argument to
append is known or easily obtain but was not used.
- Removed some not needed 'virtual' definition for functions that was
inherited from the parent. I added override to these.
- Fixed Ordered_key::print() to preallocate needed buffer. Old code could
case memory overruns.
- Simplified some loops when adding char * to a String with delimiters.
Added DDL logging to applier and replaying also so that
DDL is logged on other than originating node.
wsrep.h
Removed wsrep_thd_is_local conditions and cleaned up
the macros. Removed WSREP_TO_ISOLATION_END.
Event_job_data::execute
change_password
acl_set_default_role
mysql_execute_command
Replaced macro by function call
wsrep_to_isolation_begin
wsrep_to_isolation_end
If execution is not local log DDL-information when
wsrep_debug is enabled
No new tests required as current regression setting is
already testing these code paths.
Removed redundant code for BF abort transaction in `thr_lock.cc`.
TOI operations will ignore provided lock_wait_timeout and use `LONG_TIMEOUT`
until operation is finished.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Trigger `socket.ssl_reload` when FLUSH SSL is issued. To triger reloading
of certificate, key and CA, files needs to be physically changed.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
This patch makes the following changes around variable wsrep_on:
1) Variable wsrep_on can no longer be updated from a session that has
an active transaction running. The original behavior allowed cases
like this:
BEGIN;
INSERT INTO t1 VALUES (1);
SET SESSION wsrep_on = OFF;
INSERT INTO t1 VALUES (2);
COMMIT;
With regular transactions this would result in no replication
events (not even value 1). With streaming replication it would be
unnecessarily complex to achieve the same behavior. In the above
example, it would be possible for value 1 to be already replicated if
it happened to fill a separate fragment, while value 2 wouldn't.
2) Global variable wsrep_on no longer affects current sessions, only
subsequent ones. This is to avoid a similar case to the above, just
using just by using global wsrep_on instead session wsrep_on:
--connection conn_1
BEGIN;
INSERT INTO t1 VALUES(1);
--connection conn_2
SET GLOBAL wsrep_on = OFF;
--connection conn_1
INSERT INTO t1 VALUES(2);
COMMIT;
The above example results in the transaction to be replicated, as
global wsrep_on will only affect the session wsrep_on of new
connections.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Add condition on trx->state == TRX_STATE_COMMITTED_IN_MEMORY in order to
avoid unnecessary work. If a transaction has already been committed or
rolled back, it will release its locks in lock_release() and let
the waiting thread(s) continue execution.
Let BF wait on lock_rec_has_to_wait and if necessary other BF
is replayed.
wsrep_trx_order_before
If BF is not even replicated yet then they are ordered
correctly.
bg_wsrep_kill_trx
Make sure victim_trx is found and check also its state. If
state is TRX_STATE_COMMITTED_IN_MEMORY transaction is
already committed or rolled back and will release it locks
soon.
wsrep_assert_no_bf_bf_wait
Transaction requesting new record lock should be TRX_STATE_ACTIVE
Conflicting transaction can be in states TRX_STATE_ACTIVE,
TRX_STATE_COMMITTED_IN_MEMORY or in TRX_STATE_PREPARED.
If conflicting transaction is already committed in memory or
prepared we should wait. When transaction is committed in memory we
held trx mutex, but not lock_sys->mutex. Therefore, we
could end here before transaction has time to do lock_release()
that is protected with lock_sys->mutex.
lock_rec_has_to_wait
We very well can let bf to wait normally as other BF will be
replayed in case of conflict. For debug builds we will do
additional sanity checks to catch unsupported bf wait if any.
wsrep_kill_victim
Check is victim already in TRX_STATE_COMMITTED_IN_MEMORY state and
if it is we can return.
lock_rec_dequeue_from_page
lock_rec_unlock
Remove unnecessary wsrep_assert_no_bf_bf_wait function calls.
We can very well let BF wait here.
Added a new wsrep_mode feature DISALLOW_LOCAL_GTID for this.
Nodes can have GTIDs for local transactions in the following scenarios:
A DDL statement is executed with wsrep_OSU_method=RSU set.
A DML statement writes to a non-InnoDB table.
A DML statement writes to an InnoDB table with wsrep_on=OFF set.
If user has set wsrep_mode=DISALLOW_LOCAL_GTID these operations
produce a error ERROR HY000: Galera replication not supported
Introduced two new wsrep_mode options
* REPLICATE_MYISAM
* REPLICATE_ARIA
Depracated wsrep_replicate_myisam parameter and we use
wsrep_mode = REPLICATE_MYISAM instead.
This required small refactoring of wsrep_check_mode_after_open_table
so that both MyISAM and Aria are handled on required DML cases.
Similarly, added Aria to wsrep_should_replicate_ddl to handle DDL
for Aria tables using TOI. Added test cases and improved MyISAM testing.
Changed use of wsrep_replicate_myisam to wsrep_mode = REPLICATE_MYISAM
Two new features for Galera
* Write a warning to error log if Galera replicates table with storage engine not supported by Galera (at the moment only InnoDB is supported
** Warning is pushed to client also
** MyISAM is allowed if wsrep_replicate_myisam=ON
* Write a warning to error log if Galera replicates table with no primary key
** Warning is pushed to client also
** MyISAM is allowed if wsrep_relicate_myisam=ON
* In both cases apply flood control if > 10 same warning is writen to error log
(requires log_warnings > 1), flood control will suppress warnings for 300 seconds
For truncate we try to find out possible foreign key tables
using open_tables. However, table_list was not cleaned up
properly and there was no error handling. Fixed by cleaning
table_list and adding proper error handling.
wsrep_cluster_address_update() causes LOCK_wsrep_slave_threads
to be locked under LOCK_wsrep_cluster_config, while normally
the order should be the opposite.
Fix: don't protect @@wsrep_cluster_address value with the
LOCK_wsrep_cluster_config, LOCK_global_system_variables is enough.
Only protect wsrep reinitialization with the LOCK_wsrep_cluster_config.
And make it use a local copy of the global @@wsrep_cluster_address.
Also, introduce a helper function that checks whether
wsrep_cluster_address is set and also asserts that it can be safely
read by the caller.
Problem was that when engine substitution is allowd (e.g. sql_mode='')
we must also check db_type. Additionally, we did not resolve
default storage engine on that case and used that to check is
TOI possible or not.
* reuse the loop in THD::abort_current_cond_wait, don't duplicate it
* find_thread_by_id should return whatever it has found, it's the
caller's task not to kill COM_DAEMON (if the caller's a killer)
and other minor changes
Since 2017 (c2118a08b1) THD::awake() no longer requires LOCK_thd_data.
It uses LOCK_thd_kill, and this latter mutex is used to prevent
a thread of dying, not LOCK_thd_data as before.
Added new enum variable `wsrep_mode` which can be used to turn on WSREP
features which are not part of default behaviour.
Added enum `BINLOG_ROW_FORMAT_ONLY`, `REQUIRED_PRIMARY_KEY` and
`STRICT_REPLICATION`. `wsrep-mode=STRICT_REPLICATION` behaves
like variable `wsrep_strict_ddl`.
Variable wsrep_strict_ddl is deprecated and if set we use
new wsrep_mode setting instead.
Reviewed and improved by: Jan Lindström <jan.lindstrom@mariadb.com>
mutex order violation here.
when wsrep bf thread kills a conflicting trx, the stack is
wsrep_thd_LOCK()
wsrep_kill_victim()
lock_rec_other_has_conflicting()
lock_clust_rec_read_check_and_lock()
row_search_mvcc()
ha_innobase::index_read()
ha_innobase::rnd_pos()
handler::ha_rnd_pos()
handler::rnd_pos_by_record()
handler::ha_rnd_pos_by_record()
Rows_log_event::find_row()
Update_rows_log_event::do_exec_row()
Rows_log_event::do_apply_event()
Log_event::apply_event()
wsrep_apply_events()
and mutexes are taken in the order
lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data
When a normal KILL statement is executed, the stack is
innobase_kill_query()
kill_handlerton()
plugin_foreach_with_mask()
ha_kill_query()
THD::awake()
kill_one_thread()
and mutexes are
victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
To fix the mutex order violation we kill the victim thd asynchronously,
from the manager thread
Galera parameter wsrep_gtid_domain_id was defined using a class where
actual parameter was not a first member. Fixed this by using normal
variable and assigning this value to class member value.