Error code 12701 is already included in the default value, but other plugin-specific error codes were ignored because of the check against ER_ERROR_LAST. ER_ERROR_LAST does not cover plugin-specific error codes, so that check was removed to fix this issue.
Problem:
========
When attempting to delay a Slave attached with GTID, there appears to be an
extra delay applied initially. For example, this output reflects a Slave that is
already delayed by 43200 seconds. When switching to GTID replication,
replication is paused until SQL_Remaining_Delay counts down to 0:
CHANGE MASTER TO master_use_gtid=current_pos;
CHANGE MASTER TO MASTER_DELAY=43200;
Seconds_Behind_Master: 44847
Using_Gtid: Current_Pos
SQL_Delay: 43200
SQL_Remaining_Delay: 43089
Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event
Analysis:
=========
When the slave initiates a GTID based connection request to the master, the
master sends two GTID_LIST events. The first one is the actual GTID_LIST event
and the second one is a fake GTID_LIST event, which the master sends to provide
its current binary log file position. The fake GTID_LIST events have
ev->when=0. 'when' (the timestamp) is set to 0 so that the slave can distinguish
between real and fake Rotate events.
On slave side when MASTER_DELAY is configured to "X" the applier will ensure
that there is a time delay of "X" seconds before the event is applied.
General behaviour of MASTER_DELAY example:-
Master
timestamp of event e1=10
timestamp of event e2=11
On slave MASTER_DELAY=5
Event e1 will be applied at = 15
e2 will be applied at =16
In bug scenario:-
On Master: With GTIDs
timestamp of event e1=10
timestamp of event e2=0
On Slave:
e1 will be applied at = 10 + 5 =15
For e2, since "e2->when=0", e2->when is set to the current timestamp on the slave.
Because e2->when and the slave's current timestamp are then the same, the applier
waits for an additional master_delay=5 seconds. ev->when also contributes to
"rli->last_master_timestamp":
rli->last_master_timestamp= ev->when + (time_t) ev->exec_time;
Fake events should not update the "ev->when" to "current timestamp" on slave.
Fix:
===
Remove the assignment of current timestamp to "ev->when" when "ev->when=0".
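The effect of the fix can be illustrated with a minimal sketch. This is not the
server code: EventSketch and remaining_delay() are hypothetical names, exec_time
is ignored, and the flag overwrite_zero_when only exists here to compare the old
and new behaviour side by side.

```cpp
#include <cstdint>

// Simplified, hypothetical model of an event as seen by the applier.
struct EventSketch {
  std::int64_t when;  // master timestamp; 0 marks a fake event
};

// Returns how many seconds the applier would still wait before applying `ev`.
// `overwrite_zero_when` models the old behaviour of replacing when == 0 with
// the slave's current time.
std::int64_t remaining_delay(EventSketch ev, std::int64_t now,
                             std::int64_t master_delay,
                             bool overwrite_zero_when) {
  if (overwrite_zero_when && ev.when == 0)
    ev.when = now;  // old behaviour: the fake event looks freshly executed
  const std::int64_t apply_at = ev.when + master_delay;
  return apply_at > now ? apply_at - now : 0;
}
```

With MASTER_DELAY=5 and the slave clock at 15, the fake event e2 (when=0) waits
5 more seconds under the old behaviour and 0 seconds once the assignment is
removed; a real event such as e1 (when=10) is delayed as before.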
Removed redundant initialisation in unireg_init(): already done by
mysql_init_variables().
Slave threads already check THD::killed, which eliminates the need to
check abort_loop.
Removed unused wsrep_kill_mysql().
In contrast to thread_count, which is decremented by THD destructor,
this one was most probably intended to be decremented after all THD
destructors are done.
THD_count class was added to achieve similar effect with thread_count.
Aim is to reduce usage of LOCK_thread_count and COND_thread_count.
Part of MDEV-15135.
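One way to get a counter that is decremented only after the THD destructor has
finished is to put it in a base class, since base class destructors run after
the derived destructor body. The sketch below only illustrates that idea;
ThdCountSketch, ThdSketch and count_seen_in_thd_dtor are hypothetical names, and
std::atomic stands in for whatever counter type the server uses.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a THD_count-like helper: construction increments
// the global counter, destruction decrements it, with no mutex involved.
struct ThdCountSketch {
  static std::atomic<std::uint32_t> count;
  ThdCountSketch() { count.fetch_add(1); }
  ~ThdCountSketch() { count.fetch_sub(1); }
  ThdCountSketch(const ThdCountSketch &) = delete;
  ThdCountSketch &operator=(const ThdCountSketch &) = delete;
};
std::atomic<std::uint32_t> ThdCountSketch::count{0};

// Records the counter value observed inside the derived destructor body.
std::uint32_t count_seen_in_thd_dtor = 0;

// Hypothetical THD-like class. Its destructor body runs first; the base class
// destructor (and therefore the decrement) runs afterwards.
struct ThdSketch : ThdCountSketch {
  ~ThdSketch() { count_seen_in_thd_dtor = count.load(); }
};
```

While the derived destructor is still running the object is still counted, so
code waiting for the counter to reach zero does not proceed before the teardown
is complete.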
Implemented and integrated THD_list as a replacement for the global
thread list. It uses own mutex instead of LOCK_thread_count for THD
list protection.
Removed unused first_global_thread() and next_global_thread().
delayed_insert_threads is now protected by LOCK_delayed_insert, although
this patch doesn't fix the badly broken synchronization of this variable.
After this patch there are only 2 legitimate uses of LOCK_thread_count
left, both in mysqld.cc: thread_count and ready_to_exit.
Aim is to reduce usage of LOCK_thread_count and COND_thread_count.
Part of MDEV-15135.
This patch changes how old rows in mysql.gtid_slave_pos* tables are deleted.
Instead of doing it as part of every replicated transaction in
record_gtid(), it is done periodically (every @@gtid_cleanup_batch_size
transactions) in the slave background thread.
This removes the deletion step from the replication process in SQL or worker
threads, which could speed up replication with many small transactions. It
also decreases contention on the global mutex LOCK_slave_state. And it
simplifies the logic, e.g. when a replicated transaction fails after having
deleted old rows.
With this patch, the deletion of old GTID rows happens asynchronously and
slightly non-deterministically. Thus the number of old rows in
mysql.gtid_slave_pos can temporarily exceed @@gtid_cleanup_batch_size. But
all old rows will be deleted eventually after sufficiently many new GTIDs
have been replicated.
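The batching idea can be sketched as follows. This is a simplified model, not
the server implementation: GtidPosCleanupSketch and its members are hypothetical
names, rows are modelled as a vector of sequence numbers, and the background
thread is represented by an explicit cleanup() call.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of batched cleanup of old position rows.
class GtidPosCleanupSketch {
 public:
  explicit GtidPosCleanupSketch(std::uint32_t batch_size)
      : batch_size_(batch_size) {}

  // Called once per replicated transaction. Only inserts the new row and
  // returns true when a background cleanup should be requested.
  bool record_gtid(std::uint64_t seq_no) {
    rows_.push_back(seq_no);
    if (++pending_ >= batch_size_) {
      pending_ = 0;
      return true;
    }
    return false;
  }

  // Run outside the transaction path: deletes every row except the newest
  // one and returns the number of rows deleted.
  std::size_t cleanup() {
    if (rows_.size() <= 1) return 0;
    const std::size_t deleted = rows_.size() - 1;
    rows_.erase(rows_.begin(), rows_.end() - 1);
    return deleted;
  }

  std::size_t row_count() const { return rows_.size(); }

 private:
  std::uint32_t batch_size_;
  std::uint32_t pending_ = 0;
  std::vector<std::uint64_t> rows_;
};
```

Because cleanup() runs some time after record_gtid() requests it, more rows can
accumulate in between, which is why the row count can temporarily exceed the
batch size.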
There was a failure in rpl_delayed_slave after the recent MDEV-14528 commit.
The parallel applier should not set its
Relay_log::last_master_timestamp from a Format-descriptor log event.
The latter may reflect the deep past, so Seconds-behind-master would be
computed from it and displayed the whole time while the first, possibly
"slow", group of events is executed.
The main MDEV-14528 fix is refined; rpl_delayed_slave now also passes
in the parallel mode.
When replicated events come from a Master unaware of MariaDB GTID, their handling
by the Parallel slave misses updating Seconds_Behind_Master.
In the bug condition the Show-Slave-Status field remains unchanged.
Because in such a case event execution is sequential, the bug
is fixed by deploying the same logic as in the explicit single-threaded mode,
which is to set the
Relay_log_event::last_master_timestamp
member early, at the end of event reading from the relay log.
The extra gtid statement of a replicated transaction on the slave failed to
specify the name of an engine-specific gtid_slave_pos table correctly. When
lower-case-table-names > 0, the InnoDB table name was generated as in the
lower-case-table-names=0 version, which is of mixed case.
In the rpl.rpl_mdev12179 test run this triggered a failure to DROP the table,
because the InnoDB table handle was not closed:
InnoDB: Waited XYZ seconds for ref-count on table: `mysql`.`gtid_slave_pos_innodb`
on Windows.
The closing issue was caused by having the table registered twice in the table cache,
under its lower-case and mixed-case names. The DROP-table handler closed
only one of the cache items, leaving the second one active.
(On Linux a failure occurs earlier, at the attempt to open the expected lower-cased table:
Last_Error: Error during XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos: 1146: Table 'mysql.gtid_slave_pos_InnoDB' doesn't exist
but the table's name, as the message shows, is not in the right case).
Fixed by consulting lower-case-table-names when the engine gtid-slave-pos table
is created.
Note that a table created under one lower-case-table-names value will not be
recognized once the option is changed to a different value.
In 10.4 a follow-up patch is going to lowercase autocreated gtid-slave-pos tables
at the moment they are created; the current 10.3 patch issues a warning.
Introduced new alter algorithm types called NOCOPY and INSTANT for
inplace alter operations.
NOCOPY - The algorithm refuses any alter operation that would
rebuild the clustered index. It is a subset of the INPLACE algorithm.
INSTANT - The algorithm allows only alter operations that
modify meta data alone. It is a subset of the NOCOPY algorithm.
Introduced a new variable called alter_algorithm. The values are
DEFAULT(0), COPY(1), INPLACE(2), NOCOPY(3), INSTANT(4)
Added a message to deprecate the old_alter_table variable and made it an alias
for the alter_algorithm variable.
alter_algorithm variable for slave is always set to default.
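The subset relation between the algorithms can be sketched as below. The enum
values follow the list above; OperationNeed and algorithm_permits() are
hypothetical names, and treating COPY and DEFAULT as accepting every operation
is a simplifying assumption of this sketch, not a statement about the server.

```cpp
// Values as listed for the alter_algorithm variable.
enum class AlterAlgorithm {
  DEFAULT = 0, COPY = 1, INPLACE = 2, NOCOPY = 3, INSTANT = 4
};

// Hypothetical classification of what an ALTER operation requires.
enum class OperationNeed {
  MetadataOnly,       // only meta data changes
  InplaceNoRebuild,   // in-place work without rebuilding the clustered index
  InplaceRebuild,     // in-place, but rebuilds the clustered index
  CopyOnly            // can only be done by copying the table
};

// Returns true if the requested algorithm would accept the operation.
// INSTANT is a subset of NOCOPY, which is a subset of INPLACE.
bool algorithm_permits(AlterAlgorithm requested, OperationNeed need) {
  switch (requested) {
    case AlterAlgorithm::INSTANT:
      return need == OperationNeed::MetadataOnly;
    case AlterAlgorithm::NOCOPY:
      return need == OperationNeed::MetadataOnly ||
             need == OperationNeed::InplaceNoRebuild;
    case AlterAlgorithm::INPLACE:
      return need != OperationNeed::CopyOnly;
    case AlterAlgorithm::COPY:     // assumption: copying can do anything
    case AlterAlgorithm::DEFAULT:  // assumption: server picks a working one
      return true;
  }
  return false;
}
```

Anything INSTANT accepts is also accepted by NOCOPY and INPLACE, while an
operation that rebuilds the clustered index is accepted by INPLACE but refused
by NOCOPY and INSTANT.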
These tests can sporadically show mutex deadlock warnings between the LOCK_wsrep_thd
and LOCK_thd_data mutexes. This means that these mutexes can be locked in opposite
order by different threads, and can thus result in a deadlock situation.
To fix such issue, the locking policy of these mutexes should be revised and
enforced to be uniform. However, a quick code review shows that the number of
lock/unlock operations for these mutexes combined is between 100-200, and all these
mutex invocations should be checked/fixed.
On the other hand, it turns out that LOCK_wsrep_thd is used for protecting access to
wsrep variables of THD (wsrep_conflict_state, wsrep_query_state), whereas LOCK_thd_data
protects query, db and mysys_var variables in THD. Extending LOCK_thd_data to protect
also wsrep variables looks like a viable solution, as there should not be a use case
where separate threads need simultaneous access to wsrep variables and THD data variables.
In this commit LOCK_wsrep_thd mutex is refactored to be replaced by LOCK_thd_data.
Bluntly replacing LOCK_wsrep_thd by LOCK_thd_data would result in double locking
of LOCK_thd_data, so some adjustments have been made to fix such situations.
Modern compilers (such as GCC 8) emit warnings that the
'register' keyword is deprecated and not valid C++17.
Let us remove most use of the 'register' keyword.
Code in 'extra/' is not touched.
In this commit we are adding three more status variables to SHOW SLAVE
STATUS: Slave_DDL_Groups, Slave_Non_Transactional_Groups and
Slave_Transactional_Groups.
Slave_DDL_Groups:- This status variable counts the occurrences of DDL
statements.
Slave_Non_Transactional_Groups:- This variable counts the occurrences
of non-transactional event groups.
Slave_Transactional_Groups:- This variable counts the occurrences
of transactional event groups.
Patch Credit:- Kristian Nielsen
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. Change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider, sphinx and connect for now.
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db
Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because process list_db wasn't always
correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed not used version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
Commit 3dc3ab1a30 introduced Rows_event_tracker, which mixed size_t (the
native register width) with my_off_t (the file offset width, usually
64 bits). Use my_off_t in both member fields and member functions.
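Why such a mismatch matters can be shown with a small sketch. The type aliases
below are stand-ins chosen for illustration: narrow_size_t models size_t on a
platform where it is 32 bits wide, and off_t_sketch models a 64-bit my_off_t.
Neither name comes from the server code.

```cpp
#include <cstdint>

using off_t_sketch = std::uint64_t;   // stand-in for a 64-bit my_off_t
using narrow_size_t = std::uint32_t;  // stand-in for size_t on a 32-bit build

// Storing a file offset in the narrower type silently drops the upper bits.
narrow_size_t store_in_narrow_field(off_t_sketch pos) {
  return static_cast<narrow_size_t>(pos);
}

// Keeping the offset in the offset type preserves it.
off_t_sketch store_in_offset_field(off_t_sketch pos) { return pos; }
```

An offset of 5 GiB stored through the narrow type comes back as 1 GiB, whereas
the 64-bit field returns it unchanged; using one offset-wide type throughout
avoids that class of truncation.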
Problems
--------
The slave IO thread did not conduct an integrity check
for a group of row-based events. Specifically, it tolerated a missing
terminal block event, which must be flagged with STMT_END. Failure to
react to its loss can confuse the applier thread in various ways.
Another potential issue was that there was no check for an impossible
second consecutive Gtid-log-event while the slave IO thread is receiving
events to be skipped after reconnect.
Fixes
-----
The slave io thread is made by this patch to track the rows event
STMT_END status.
Whenever, on reading the next event, the IO thread finds out that a preceding
Rows event did not actually have the flag, an
explicit error is issued.
Replication can be resumed after the source of failure is eliminated,
see a provided test.
Note that currently the row-based group integrity check excludes
the compressed version 2 Rows events (which are not generated by MariaDB
master).
Its uncompressed counterpart is manually tested.
The second issue is handled by producing an error when the IO thread
receives a successive Gtid_log_event while it is skipping events
after reconnect.
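The STMT_END tracking can be sketched roughly as follows. This is a simplified
illustration, not the Rows_event_tracker implementation: the event kinds, the
IncomingEvent struct and RowsGroupCheckSketch are hypothetical, and every
non-Rows event is treated alike.

```cpp
// Hypothetical, reduced set of event kinds seen by the IO thread.
enum class EventKind { Gtid, Rows, Other };

struct IncomingEvent {
  EventKind kind;
  bool stmt_end;  // only meaningful for Rows events
};

// Remembers whether the last Rows event left its group unterminated.
class RowsGroupCheckSketch {
 public:
  // Returns false when the event shows that the preceding Rows event should
  // have carried STMT_END but did not, i.e. the terminal block was lost.
  bool accept(const IncomingEvent &ev) {
    if (ev.kind == EventKind::Rows) {
      open_rows_group_ = !ev.stmt_end;
      return true;
    }
    if (open_rows_group_) return false;  // group was never terminated
    return true;
  }

 private:
  bool open_rows_group_ = false;
};
```

A sequence Rows, Rows(STMT_END), Gtid is accepted, while Rows followed directly
by Gtid is rejected at the Gtid event, which is the point where an explicit
error would be raised.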
Other things, mainly to get
create_mysqld_error_find_printf_error tool to work:
- Added protection to not include mysqld_error.h twice
- Include "unireg.h" instead of "mysqld_error.h" in server
- Added protection if ER_XX messages are already defined
- Removed wrong calls to my_error(ER_OUTOFMEMORY) as
my_malloc() and my_alloc will do this automatically
- Added missing %s to ER_DUP_QUERY_NAME
- Removed old and wrong calls to my_strerror() when using
MY_ERROR_ON_RENAME (wrong merge)
- Fixed deadlock error message from Galera. Before, the extra
information given to ER_LOCK_DEADLOCK was lost because
ER_LOCK_DEADLOCK doesn't provide any extra information.
I kept #ifdef mysqld_error_find_printf_error_used in sql_acl.h
to make it easy to do this kind of check again in the future
and specifically the ack receiving functionality.
Semisync is made static instead of a plugin, so its functions
are invoked at the same points as RUN_HOOKS.
The RUN_HOOKS and the observer interface remain, to be removed by a later
patch.
Todo:
React to killed status in repl_semisync_master.wait_after_sync(). Currently
Repl_semi_sync_master::commit_trx does not check the killed status.
A few bugfixes that are present in MySQL were found, and it's unclear
whether/how they are covered. Those include:
Bug#15985893: GTID SKIPPED EVENTS ON MASTER CAUSE SEMI SYNC TIME-OUTS
Bug#17932935 CALLING IS_SEMI_SYNC_SLAVE() IN EACH FUNCTION CALL
HAS BAD PERFORMANCE
Bug#20574628: SEMI-SYNC REPLICATION PERFORMANCE DEGRADES WITH A HIGH NUMBER OF THREADS