MDEV-25336 Parallel replication causes failed assert while restarting

Problem: When the slave is shut down, the following assertion fails:
sql/sql_list.h:642: void ilink::assert_linked(): Assertion `prev != 0
&& next != 0' failed.

Solution: In close_connections, calling threads.get() resets prev and
next to NULL. The parallel worker thread (handle_rpl_parallel_thread) then
calls unlink_not_visible_thd(), which asserts that prev and next are not NULL.
unlink_not_visible_thd() must therefore always be called before threads.get()
is called. To make sure the workers call unlink_not_visible_thd() first,
slave_prepare_for_shutdown() now deactivates the worker thread pool,
which in turn closes all worker threads. Since this is already done in 10.4
and 10.5, I am backporting MDEV-20821 and MDEV-22370 to 10.2. MDEV-22370
improves the MDEV-20821 patch.
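
The ordering problem described above can be illustrated with a minimal,
standalone sketch. This is not the server code: Node, List and
unlink_checked() are hypothetical stand-ins for ilink, the global threads
list and unlink_not_visible_thd(), and the check returns false where the
real code would hit the assertion.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an intrusive list node (not the real ilink).
struct Node {
  Node *prev = nullptr;
  Node *next = nullptr;
  // Mirrors the condition of the failing assertion: prev != 0 && next != 0.
  bool linked() const { return prev != nullptr && next != nullptr; }
};

// Hypothetical circular list with a sentinel, standing in for "threads".
struct List {
  Node head;
  List() { head.prev = head.next = &head; }

  void append(Node *n) {
    n->prev = head.prev;
    n->next = &head;
    head.prev->next = n;
    head.prev = n;
  }

  // Pops the first element and resets its links to NULL, which is the
  // behaviour the commit message attributes to threads.get().
  Node *get() {
    if (head.next == &head) return nullptr;
    Node *n = head.next;
    head.next = n->next;
    n->next->prev = &head;
    n->prev = n->next = nullptr;
    return n;
  }

  bool empty() const { return head.next == &head; }
};

// Hypothetical worker-side unlink. Returns false in the situation where
// the real code would fail the assertion instead of unlinking.
bool unlink_checked(Node *n) {
  if (!n->linked()) return false;
  n->prev->next = n->next;
  n->next->prev = n->prev;
  n->prev = n->next = nullptr;
  return true;
}
```

In this sketch, unlinking a node that is still on the list succeeds, while
unlinking it after get() has already popped it reports the failure. That is
why the workers have to be stopped before close_connections drains the list.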
Sachin Kumar 2021-04-14 10:56:12 +01:00
parent 355dc74b76
commit e607f3398c

@@ -4,6 +4,10 @@
# hang when the parallel workers were idle.
# The bug reported scenario is extented to cover the multi-sources case as well as
# checking is done for both the idle and busy workers cases.
#
# MDEV-25336 Parallel replication causes failed assert while restarting
# Since this test case involves slave restart this will help in testing
# Mdev-25336 too.
--source include/have_innodb.inc
--source include/have_binlog_format_mixed.inc
@@ -26,7 +30,7 @@ select @@global.slave_parallel_workers as two;
# At this point worker threads have no assignement.
# Shutdown must not hang.
# In 10.2/10.3 there should not be any assert failure `prev != 0 && next != 0'
--connection server_3
--write_file $MYSQLTEST_VARDIR/tmp/mysqld.3.expect
wait
@@ -75,6 +79,7 @@ insert into t1 values (1);
--connection server_3
--sync_with_master 0,''
# In 10.2/10.3 there should not be any assert failure `prev != 0 && next != 0'
# At this point worker threads have no assignement.
# Shutdown must not hang.
@@ -117,6 +122,7 @@ insert into t1 values (2);
insert into t2 values (2);
# In 10.2/10.3 there should not be any assert failure `prev != 0 && next != 0'
# At this point there's a good chance the worker threads are busy.
# SHUTDOWN must proceed without any delay as above.
--connection server_3