The semisync ack collector hits an assert on an out-of-bounds fd value,
with the following stack:
/usr/sbin/mysqld(_ZN12Ack_receiver17get_slave_socketsEP6fd_setPj+0x70)[0x7fa3bbe27400]
/usr/sbin/mysqld(_ZN12Ack_receiver3runEv+0x540)[0x7fa3bbe27980]
/usr/sbin/mysqld(ack_receive_handler+0x19)[0x7fa3bbe27a79]
The cause of the failure appears to be the same as in
https://bugs.mysql.com/bug.php?id=79865, whose fixes are applied here
with minor changes.
Specifically, the semisync ack thread is changed to use poll()
instead of select() on platforms where the former is defined.
On systems that still use select(), the ack receiver thread will generate
an error and semisync will be switched off when a socket fd is out of
bounds. Windows is an exception, because this limitation does not exist
there.
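The difference between the two calls can be sketched as below. This is only
an illustration for POSIX systems, not the code of the patch: the helper
names fits_in_fd_set() and make_poll_set() are hypothetical. An fd_set is a
fixed-size bitmap that can only hold descriptors below FD_SETSIZE (commonly
1024 on Linux), while a pollfd array places no bound on the fd value.

```cpp
#include <poll.h>
#include <sys/select.h>
#include <vector>

// Hypothetical helper (not from the patch): true if fd can be stored in
// an fd_set without going past the end of its fixed-size bitmap.
inline bool fits_in_fd_set(int fd)
{
  return fd >= 0 && fd < FD_SETSIZE;
}

// Hypothetical helper (not from the patch): build the array handed to
// poll(). Each entry carries the fd itself, so values at or above
// FD_SETSIZE are fine.
inline std::vector<pollfd> make_poll_set(const std::vector<int> &fds)
{
  std::vector<pollfd> set;
  set.reserve(fds.size());
  for (int fd : fds)
  {
    pollfd p;
    p.fd = fd;
    p.events = POLLIN;   // wait for incoming ack packets
    p.revents = 0;
    set.push_back(p);
  }
  return set;
}
```

On a select()-only platform a check like fits_in_fd_set() is where the
error would be raised before calling FD_SET(); with poll() no such check
is needed.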
The patch sustains manual testing with `mysqlslap --concurrency > 1024'
running in the background while the slave IO thread is restarted multiple
times, which specifically exercises the ack receiving functionality.
Semisync is made static instead of a plugin, so its functions
are invoked at the same points as RUN_HOOKS.
RUN_HOOKS and the observer interface remain for now, to be removed by a
later patch.
Todo:
React to the killed status in repl_semisync_master.wait_after_sync(). Currently
Repl_semi_sync_master::commit_trx does not check the killed status.
A few bugfixes were found that are present in MySQL, and it is unclear
whether/how they are covered here. Those include:
Bug#15985893: GTID SKIPPED EVENTS ON MASTER CAUSE SEMI SYNC TIME-OUTS
Bug#17932935: CALLING IS_SEMI_SYNC_SLAVE() IN EACH FUNCTION CALL HAS BAD PERFORMANCE
Bug#20574628: SEMI-SYNC REPLICATION PERFORMANCE DEGRADES WITH A HIGH NUMBER OF THREADS