Commit graph

17 commits

Author SHA1 Message Date
Andrei
9c43343213 MDEV-32365 detailize the semisync replication magic number error
Semisync ack (master side) receiver thread is made to report
details of faced errors.
In case of 'magic byte' error, a hexdump of the received packet
is always (level) NOTEd into the error log.
In other cases an exact server level error is print out
as a warning (as it may not be critical) under log_warnings > 2.

An MTR test added for the magic byte error. For others existing mtr
tests cover that, provided log_warnings > 2 is set.
2023-10-26 20:24:44 +03:00
Anel Husakovic
c596ad734d MDEV-30269: Remove rpl_semi_sync_[slave,master] usage in code
- Description:
  - Before 10.3.8 semisync was a plugin that is built into the server with
    MDEV-13073,starting with commit cbc71485e2.
    There are still some usage of `rpl_semi_sync_master` in mtr.
Note:
  - To recognize the replica in the `dump_thread`, replica is creating
    local variable `rpl_semi_sync_slave` (the keyword of plugin) in
    function `request_transmit`, that is catched by primary in
    `is_semi_sync_slave()`. This is the user variable and as such not
    related to the obsolete plugin.

 - Found in `sys_vars.all_vars` and `rpl_semi_sync_wait_point` tests,
   usage of plugins `rpl_semi_sync_master`, `rpl_semi_sync_slave`.
   The former test is disabled by default (`sys_vars/disabled.def`)
   and marked as `obsolete`, however this patch will remove the queries.

- Add cosmetic fixes to semisync codebase

Reviewer: <brandon.nesterenko@mariadb.com>
Closes PR #2528, PR #2380
2023-03-23 13:39:46 +01:00
Marko Mäkelä
e3fdabd501 MDEV-29613 fixup: clang -Wunused-but-set-variable 2022-09-26 15:16:51 +03:00
Brandon Nesterenko
a83c7ab1ea MDEV-11853: semisync thread can be killed after sync binlog but before ACK in the sync state
Problem:
========
If a primary is shutdown during an active semi-sync connection
during the period when the primary is awaiting an ACK, the primary
hard kills the active communication thread and does not ensure the
transaction was received by a replica. This can lead to an
inconsistent replication state.

Solution:
========
During shutdown, the primary should wait for an ACK or timeout
before hard killing a thread which is awaiting a communication. We
extend the `SHUTDOWN WAIT FOR SLAVES` logic to identify and ignore
any threads waiting for a semi-sync ACK in phase 1. Then, before
stopping the ack receiver thread, the shutdown is delayed until all
waiting semi-sync connections receive an ACK or time out. The
connections are then killed in phase 2.

Notes:
 1) There remains an unresolved corner case that affects this
patch. MDEV-28141: Slave crashes with Packets out of order when
connecting to a shutting down master. Specifically, If a slave is
connecting to a master which is actively shutting down, the slave
can crash with a "Packets out of order" assertion error. To get
around this issue in the MTR tests, the primary will wait a small
amount of time before phase 1 killing threads to let the replicas
safely stop (if applicable).
 2) This patch also fixes MDEV-28114: Semi-sync Master ACK Receiver
Thread Can Error on COM_QUIT

Reviewed By
============
Andrei Elkin <andrei.elkin@mariadb.com>
2022-04-22 12:59:54 -06:00
Brandon Nesterenko
32ab6219be MDEV-25580: rpl.rpl_semi_sync_slave_compressed_protocol crashes because of wrong packet
Problem:
========
When both semi-sync and slave compression are enabled, the numbering
on packet headers can become out of sync between the primary and
replica servers. More specifically, after the master flushes its
write, it should increment the counters that track packets. The
bug is such that the master only updates the normal packet counter
and leaves the compressed packet counter alone.

Solution:
========
After the master flushes, additionally increment the compressed
packet counter.

Reviewed By:
============
Andrei Elkin: <andrei.elkin@mariadb.com>
2022-03-24 07:25:22 -06:00
Michael Widenius
716d396bb3 Remove \n from DBUG_PRINT statements 2019-10-21 18:41:58 +03:00
Sergey Vojtovich
779fb636da Revert THD::THD(skip_global_sys_var_lock) argument
Originally introduced by e972125f1 to avoid harmless wait for
LOCK_global_system_variables in a newly created thread, which creation was
initiated by system variable update.

At the same time it opens dangerous hole, when system variable update
thread already released LOCK_global_system_variables and ack_receiver
thread haven't yet completed new THD construction. In this case THD
constructor goes completely unprotected.

Since ack_receiver.stop() waits for the thread to go down, we have to
temporarily release LOCK_global_system_variables so that it doesn't
deadlock with ack_receiver.run(). Unfortunately it breaks atomicity
of rpl_semi_sync_master_enabled updates and makes them not serialized.

LOCK_rpl_semi_sync_master_enabled was introduced to workaround the above.
TODO: move ack_receiver start/stop into repl_semisync_master
enable_master/disable_master under LOCK_binlog protection?

Part of MDEV-14984 - regression in connect performance
2019-05-03 16:46:11 +04:00
Andrei Elkin
42c58b87da MDEV-18096 The server would crash when has configs rpl_semi_sync_master_enabled = OFF rpl_semi_sync_master_wait_no_slave = OFF
The patch fixes a fired assert in the semisync master module. The assert
caught attempt to switch semisync off (per rpl_semi_sync_master_wait_no_slave = OFF)
when it was not even initialized (per rpl_semi_sync_master_enabled = OFF).
The switching-off execution branch is relocated under one that executes
enable_master() first.

A minor cleaup is done to remove the int return from two functions that
did not return anything but an error which could not happen in the functions.
2019-04-19 18:48:16 +03:00
Monty
30ebc3ee9e Add likely/unlikely to speed up execution
Added to:
- if (error)
- Lex
- sql_yacc.yy and sql_yacc_ora.yy
- In header files to alloc() calls
- Added thd argument to thd_net_is_killed()
2018-05-07 00:07:32 +03:00
Vladislav Vaintroub
6c279ad6a7 MDEV-15091 : Windows, 64bit: reenable and fix warning C4267 (conversion from 'size_t' to 'type', possible loss of data)
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.

This fix excludes rocksdb, spider,spider, sphinx and connect for now.
2018-02-06 12:55:58 +00:00
Monty
a7e352b54d Changed database, tablename and alias to be LEX_CSTRING
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db

Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
  for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because process list_db wasn't always
  correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
  handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed not used version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
  NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
2018-01-30 21:33:55 +02:00
Andrei Elkin
529120e1cb MDEV-13073. This patch is a followup of the previous one to convert the trailing underscore identifier to mariadb standard. For identifier representing class private members the underscore is replaced with a m_ prefix. Otherwise _ is just removed. 2017-12-18 13:43:38 +02:00
Andrei Elkin
f279d3c43a MDEV-13073. This part converts the Ali patch`s identifiers to the MariaDB standard. Also some renaming is done as well as white spaces removal. 2017-12-18 13:43:38 +02:00
Andrei Elkin
6a84e3407d MDEV-13073. This patch replaces semisync's native function_enter,exit
and its custom trace faciltiy with standard DBUG_ based equivalents.
2017-12-18 13:43:38 +02:00
Andrei Elkin
74b35b6874 MDEV-13073. This part patch weeds out RUN_HOOK from the server as semisync
is defined statically. Consequently the observer interfaces are removed
as well.
2017-12-18 13:43:37 +02:00
Andrei Elkin
e972125f11 MDEV-13073 This part merges the Ali semisync related changes
and specifically the ack receiving functionality.
Semisync is turned to be static instead of plugin so its functions
are invoked at the same points as RUN_HOOKS.
The RUN_HOOKS and the observer interface remain to be removed by later
patch.

Todo:
  React on killed status by repl_semisync_master.wait_after_sync(). Currently
  Repl_semi_sync_master::commit_trx does not check the killed status.

  There were few bugfixes found that are present in mysql and its unclear
  whether/how they are covered. Those include:

  Bug#15985893: GTID SKIPPED EVENTS ON MASTER CAUSE SEMI SYNC TIME-OUTS
  Bug#17932935 CALLING IS_SEMI_SYNC_SLAVE() IN EACH FUNCTION CALL
                 HAS BAD PERFORMANCE
  Bug#20574628: SEMI-SYNC REPLICATION PERFORMANCE DEGRADES WITH A HIGH NUMBER OF THREADS
2017-12-18 13:43:37 +02:00
Monty
2e53b96a0a Moved semisync from a plugin to normal server
Part of MDEV-13073 AliSQL Optimize performance of semisync

Did the following renames to match other similar variables
key_ss_mutex_LOCK_binlog_       > key_LOCK_bing
key_ss_cond_COND_binlog_send_  -> key_COND_binlog_send
COND_binlog_send_              -> COND_binlog_send
LOCK_binlog_                   -> LOCK_binlog

debian/mariadb-server-10.2.install does not install semisync libs.
2017-12-18 13:43:36 +02:00
Renamed from plugin/semisync/semisync_master.cc (Browse further)