Adding a FK constraint to an existing table (ALTER TABLE ADD FOREIGN
KEY) causes the applier to fail, if a concurrent DML statement that
violate the new constraint (i.e. a DELETE or UPDATE of record in the
parent table).
For exmaple, the following scenario causes a crash in the applier:
1. ALTER successfully adds FK constraint in node_1
2. On node_2 is UPDATE is in pre_commit() and has certified successfully
3. ALTER is delivered in node_2 and BF aborts DML
4. Applying UPDATE event causes FK violation in node_1
To avoid this situation it is necessary for UPDATE to fail during
certification. And for the UPDATE to fail certfication it is necessary
that ALTER appends certification keys for both the child and the parent
table. Before this patch, ALTER TABLE ADD FK only appended keys for
child table which is ALTERed.
Restricted output for CREATE USER, GRANT, REVOKE and SET PASSWORD
so that it shows only above keywords but not rest of query i.e.
not user or password.
Restricted output for CREATE USER, GRANT, REVOKE and SET PASSWORD
so that it shows only above keywords but not rest of query i.e.
not user or password.
These test can sporadically show mutex deadlock warnings between LOCK_wsrep_thd
and LOCK_thd_data mutexes. This means that these mutexes can be locked in opposite
order by different threads, and thus result in deadlock situation.
To fix such issue, the locking policy of these mutexes should be revised and
enforced to be uniform. However, a quick code review shows that the number of
lock/unlock operations for these mutexes combined is between 100-200, and all these
mutex invocations should be checked/fixed.
On the other hand, it turns out that LOCK_wsrep_thd is used for protecting access to
wsrep variables of THD (wsrep_conflict_state, wsrep_query_state), whereas LOCK_thd_data
protects query, db and mysys_var variables in THD. Extending LOCK_thd_data to protect
also wsrep variables looks like a viable solution, as there should not be a use case
where separate threads need simultaneous access to wsrep variables and THD data variables.
In this commit LOCK_wsrep_thd mutex is refactored to be replaced by LOCK_thd_data.
By bluntly replacing LOCK_wsrep_thd by LOCK_thd_data, will result in double locking
of LOCK_thd_data, and some adjustements have been performed to fix such situations.
fix galera.galera_sst_mysqldump test to work:
* must connect to 127.0.0.1, where mysqld is listening
* disable wsrep_sync_wait in wsrep_sst_mysqldump, otherwise
sst can deadlock
* allow 127.0.0.1 for bind_address and wsrep_sst_receive_address.
(it's useful in tests, or when two nodes are on the same box,
or when nodes are on different boxes, but the connection is
tunelled, or whatever. Don't judge user's setup). MDEV-14070
* don't wait for client connections to die when doing
mysqldump sst. they'll die in a due time, and if needed mysql
will wait on locks until they do. MDEV-14069
Also don't mark it big, to make sure it's sufficiently tested
This test failed to work properly because the fixes it came
with were not merged from upstream.
The test would fail with a spurious ER_LOCK_DEADLOCK error
for a conflict that happened earlier in the test execution,
while wsrep is disabled.
The original fix was to set THD::wsrep_conflict_state only
if wsrep is enabled (see wsrep_thd_set_conflict_state() in
sql/wsrep_mysqld.cc)
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider,spider, sphinx and connect for now.
This preserves const str for constant strings
Other things
- A few variables where changed from LEX_STRING to LEX_CSTRING
- Incident_log_event::Incident_log_event and record_incident where
changed to take LEX_CSTRING* as an argument instead of LEX_STRING
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db
Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because process list_db wasn't always
correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed not used version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
Other things, mainly to get
create_mysqld_error_find_printf_error tool to work:
- Added protection to not include mysqld_error.h twice
- Include "unireg.h" instead of "mysqld_error.h" in server
- Added protection if ER_XX messages are already defined
- Removed wrong calls to my_error(ER_OUTOFMEMORY) as
my_malloc() and my_alloc will do this automatically
- Added missing %s to ER_DUP_QUERY_NAME
- Removed old and wrong calls to my_strerror() when using
MY_ERROR_ON_RENAME (wrong merge)
- Fixed deadlock error message from Galera. Before the extra
information given to ER_LOCK_DEADLOCK was missing because
ER_LOCK_DEADLOCK doesn't provide any extra information.
I kept #ifdef mysqld_error_find_printf_error_used in sql_acl.h
to make it easy to do this kind of check again in the future
Problem:- Gtid are not transferred in Galera Cluster.
Solution:- We need to transfer gtid in the case on either when cluster is
slave/master in async replication. In normal Gtid replication gtid are generated on
recieving node itself and it is always on sync with other nodes. Because galera keeps
node in sync , So all nodes get same no of event groups. So the issue arises when
say galera is slave in async replication.
A
| (Async replication)
D <-> E <-> F {Galera replication}
So what should happen is that all node should apply the master gtid but this does
node happen, becuase node E, F does not recieve gtid from D in write set , So what E(or F)
does is that it applies wsrep_gtid_domain_id, D server-id , E gtid next seq no. This
generated gtid does not always work when say A has different domain id.
So In this commit, on galera node when we see that this event is recieved from master
we simply write Gtid_Log_Event in write_set and send it to other nodes.
LOCK_thd_data was used to protect both THD data and
ensure that the THD is not deleted while it was in use
This patch moves the THD delete protection to LOCK_thd_kill,
which already protects the THD for kill.
The benefits are:
- More well defined what LOCK_thd_data protects
- LOCK_thd_data usage is now much simpler and easier to verify
- Less chance of deadlocks in SHOW PROCESS LIST as there is less
chance of interactions between mutexes
- Remove not needed LOCK_thread_count from
thd_get_error_context_description()
- Fewer mutex taken for thd->awake()
Other things:
- Don't take mysys->var mutex in show processlist to check if thread
is kill marked
- thd->awake() now automatically takes the LOCK_thd_kill mutex
(Simplifies code)
- Apc uses LOCK_thd_kill instead of LOCK_thd_data
Problem:- This crash happens because of thd = NULL , and while checking
for wsrep_on , we no longer check for thd != NULL (MDEV-7955). So this
problem is regression of MDEV-7955. However this patch not only solves
this regression , It solves all regression caused by MDEV-7955 patch.
To get all possible cases when thd can be null , assert(thd)/
assert(trx->mysql_thd) is place just before all wsrep_on and innodb test
suite is run. And the assert which caused failure are removed with a physical
check for thd != NULL. Rest assert are removed. Hopefully this method will
remove all current/potential regression of MDEV-7955.
- Added missing delete thd in bootstrap()
- Delete wrong 'delete thd' in start_wsrep_THD()
- Added missing 'delete thd' in case of SCHEDULER_ONE_THREAD_PER_CONNECTION
- Delete wrong dec_thread_running() in destroy_thd() as it caused thread_running
to be wrong.
- Moved reset_killed() to a normal function to make it easier to debug
- Added check of mutex in wsrep_aborting_thd... functions
For running the Galera tests, the variable my_disable_leak_check
was set to true in order to avoid assertions due to memory leaks
at shutdown.
Some adjustments due to MDEV-13625 (merge InnoDB tests from MySQL 5.6)
were performed. The most notable behaviour changes from 10.0 and 10.1
are the following:
* innodb.innodb-table-online: adjustments for the DROP COLUMN
behaviour change (MDEV-11114, MDEV-13613)
* innodb.innodb-index-online-fk: the removal of a (1,NULL) record
from the result; originally removed in MySQL 5.7 in the
Oracle Bug #16244691 fix
377774689b
* innodb.create-index-debug: disabled due to MDEV-13680
(the MySQL Bug #77497 fix was not merged from 5.6 to 5.7.10)
* innodb.innodb-alter-autoinc: MariaDB 10.2 behaves like MySQL 5.6/5.7,
while MariaDB 10.0 and 10.1 assign different values when
auto_increment_increment or auto_increment_offset are used.
Also MySQL 5.6/5.7 exhibit different behaviour between
LGORITHM=INPLACE and ALGORITHM=COPY, so something needs to be tested
and fixed in both MariaDB 10.0 and 10.2.
* innodb.innodb-wl5980-alter: disabled because it would trigger an
InnoDB assertion failure (MDEV-13668 may need additional effort in 10.2)
wsrep_drop_table_query(): Remove the definition of this ununsed function.
row_upd_sec_index_entry(), row_upd_clust_rec_by_insert():
Evaluate the simplest conditions first. The merge could have slightly
hurt performance by causing extra calls to wsrep_on().
WSREP_LOG increased the stack size of all function, where it was used,
with 1024 bytes, even if it would never called. What's worse is that one
a Mac with Lion 10.7, the stack size was increased with 1024* number of
calls per function.
By moving WSREP_LOG() to a function, the stack size on Lion in function
mysql_execute() was reduced from 18K to 12K. On my main Linux machine
with gcc 5.4 the reduction was from 5k to 4k.