Commit graph

21 commits

Author SHA1 Message Date
Brandon Nesterenko
a64991df9d MDEV-4989: Support for GTID in mysqlbinlog
This patch fixes two issues:

First, it fixes test failure due to GTID List events
having inconsistent ordering of domain ids. In
particular, this patch ensures that a GTID list log
event will have its GTIDs ordered by domain id
(ascending) followed by sequence number (ascending).

Second, it fixes an assert which could use an
unintialized variable.

Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
2022-01-31 09:30:23 -07:00
Sachin
0c5d1342ae MDEV-11675 Lag Free Alter On Slave
This commit implements two phase binloggable ALTER.
When a new

      @@session.binlog_alter_two_phase = YES

ALTER query gets logged in two parts, the START ALTER and the COMMIT
or ROLLBACK ALTER. START Alter is written in binlog as soon as
necessary locks have been acquired for the table. The timing is
such that any concurrent DML:s that update the same table are either
committed, thus logged into binary log having done work on the old
version of the table, or will be queued for execution on its new
version.

The "COMPLETE" COMMIT or ROLLBACK ALTER are written at the very point
of a normal "single-piece" ALTER that is after the most of
the query work is done. When its result is positive COMMIT ALTER is
written, otherwise ROLLBACK ALTER is written with specific error
happened after START ALTER phase.
Replication of two-phase binloggable ALTER is
cross-version safe. Specifically the OLD slave merely does not
recognized the start alter part, still being able to process and
memorize its gtid.

Two phase logged ALTER is read from binlog by mysqlbinlog to produce
BINLOG 'string', where 'string' contains base64 encoded
Query_log_event containing either the start part of ALTER, or a
completion part. The Query details can be displayed with `-v` flag,
similarly to ROW format events.  Notice, mysqlbinlog output containing
parts of two-phase binloggable ALTER is processable correctly only by
binlog_alter_two_phase server.

@@log_warnings > 2 can reveal details of binlogging and slave side
processing of the ALTER parts.

The current commit also carries fixes to the following list of
reported bugs:
MDEV-27511, MDEV-27471, MDEV-27349, MDEV-27628, MDEV-27528.

Thanks to all people involved into early discussion of the feature
including Kristian Nielsen, those who helped to design, implement and
test: Sergei Golubchik, Andrei Elkin who took the burden of the
implemenation completion, Sujatha Sivakumar, Brandon
Nesterenko, Alice Sherepa, Ramesh Sivaraman, Jan Lindstrom.
2022-01-27 21:25:07 +02:00
Brandon Nesterenko
79e3ee00fa MDEV-4989: Support for GTID in mysqlbinlog
New Feature:
===========
This commit extends the mariadb-binlog capabilities to allow events
to be filtered by GTID ranges. More specifically, the
--start-position and --stop-position arguments have been extended to
accept values formatted as a list of GTID positions, e.g.
--start-position=0-1-0,1-2-55. The following specific capabilities
are addressed:
   1) GTIDs can be used to filter results on local binlog files
   2) GTIDs can be used to filter results from remote servers
   3) Implemented --gtid-strict-mode that ensures the GTID event
      stream in each domain is monotonically increasing
   4) Added new level of verbosity in mysqlbinlog -vvv to print
      additional diagnostic information/warnings about invalid GTID
      states
   5) For a given GTID range, its start and stop position parameters
      aim to mimic the behaviors of
      CHANGE MASTER TO MASTER_USE_GTID=slave_pos and
      START SLAVE UNTIL master_gtid_pos=<GTID>, respectively. In
      particular, the start-position list expresses a gtid state of
      the server, similarly to how @@global.gtid_slave_pos expresses
      the gtid state of a slave server when connecting to a master
      with MASTER_USE_GTID=slave_pos.
      The GTID start-position list is exclusive and the
      stop-position list is inclusive. This allows users to receive
      events strictly after those that they already have, and is
      useful in  cases of point in (logical) time recovery including
      1) events were received out of order and should be re-sent, or
      2) specifying the gtid state of a slave to get events newer
      than their current state. If a seq_no is 0 for start-position,
      it means to include the entirety of the domain. If a seq_no is
      0 for stop-position, it means to exclude all events from that
      domain. The GTIDs provided in a start position argument must
      match with the GTID state of the first processed log (i.e.
      those listed in the Gtid_list event). If a stop position is
      provided, the events that are output are limited to only those
      with domain ids listed in the argument. When specifying
      combinations of start and stop positions, the following
      behaviors are expected:

[--start-position without --stop-position]: Events that have domain
ids in the start position are output if their seq_no occurs after
the respective start position. Events with domain ids that are
unspecified in the start position list are also output. Note that if
the Gtid_list event of the first binary log is populated (i.e.
non-empty), each domain in the Gtid_list must be present in the
start-position list with a seq_no at or after the listed value.
This behavior mimics how a slave only processes events after the
state provided by @@global.gtid_slave_pos when connecting to a
master with CHANGE MASTER TO MASTER_USE_GTID=slave_pos.

[--stop-position without --start-position]: Output is limited to
only events with both 1) domain ids that are present in the given
stop position list and 2) seq_nos that are less than or equal to
their respective stop GTID. Once all GTIDs in the stop position
list have been processed, the program will stop processing log
files. This behavior mimics how
START SLAVE UNTIL master_gtid_pos=<G>
has a slave only process events with domain ids present in G with
their seq_nos at or before the respective gtid.

[--start-position and --stop-position]: Output consists of the
intersection between the events permitted by both the start and stop
position rules. More concretely, the output can be defined by a
union of the following rules:

  1. For domains which exist in both the start and stop position
     lists, the events which exist in-between these positions
     (exclusive start, inclusive stop) are output
  2. For all other events, the rules of
     [--stop-position without --start-position] are followed

This is due to the implicit filtering within each individual rule.
Even though the start position rule always includes events from
unspecified domains, the stop position rule takes precedence because
it always excludes events from unspecified domains. In other words,
events which the start position rule would have included would then
always be excluded by the stop position rule.

[neither --start-position nor --stop-position]: Events are not
omitted based on GTID positioning; however, --gtid-strict-mode and
-vvv can still analyze gtid correctness for warning and error
reporting.

[repeated specification of --start-position or --stop-position]:
Subsequent specifications of start and stop positions completely
override previous ones. E.g., if invoked as
mysqlbinlog --start-position=<G1> --start-position=<G2> ...
All GTIDs specified in G1 are ignored and only those specified in G2
are used for the start position.

A few additional notes:
 1) this commit squashes together the commits:
f4319661120e-78a9d49907ba

 2) Changed rpl.rpl_blackhole_row_annotate test because it has
out of order GTIDs in its binlog, so I added
--skip-gtid-strict-mode

 3) After all binlog events have been written, the session server
    id and domain id are reset to their values in the global state

Reviewed By:
===========
Andrei Elkin: <andrei.elkin@mariadb.com>
2022-01-26 14:17:21 -07:00
Eric Herman
401ff6994d MDEV-26221: DYNAMIC_ARRAY use size_t for sizes
https://jira.mariadb.org/browse/MDEV-26221
my_sys DYNAMIC_ARRAY and DYNAMIC_STRING inconsistancy

The DYNAMIC_STRING uses size_t for sizes, but DYNAMIC_ARRAY used uint.
This patch adjusts DYNAMIC_ARRAY to use size_t like DYNAMIC_STRING.

As the MY_DIR member number_of_files is copied from a DYNAMIC_ARRAY,
this is changed to be size_t.

As MY_TMPDIR members 'cur' and 'max' are copied from a DYNAMIC_ARRAY,
these are also changed to be size_t.

The lists of plugins and stored procedures use DYNAMIC_ARRAY,
but their APIs assume a size of 'uint'; these are unchanged.
2021-10-19 16:00:26 +03:00
Monty
47010ccffa MDEV-23842 Atomic RENAME TABLE
- Major rewrite of ddl_log.cc and ddl_log.h
  - ddl_log.cc described in the beginning how the recovery works.
  - ddl_log.log has unique signature and is dynamic. It's easy to
    add more information to the header and other ddl blocks while still
    being able to execute old ddl entries.
  - IO_SIZE for ddl blocks is now dynamic. Can be changed without affecting
    recovery of old logs.
  - Code is more modular and is now usable outside of partition handling.
  - Renamed log file to dll_recovery.log and added option --log-ddl-recovery
    to allow one to specify the path & filename.
- Added ddl_log_entry_phase[], number of phases for each DDL action,
  which allowed me to greatly simply set_global_from_ddl_log_entry()
- Changed how strings are stored in log entries, which allows us to
  store much more information in a log entry.
- ddl log is now always created at start and deleted on normal shutdown.
  This simplices things notable.
- Added probes debug_crash_here() and debug_simulate_error() to simply
  crash testing and allow crash after a given number of times a probe
  is executed. See comments in debug_sync.cc and rename_table.test for
  how this can be used.
- Reverting failed table and view renames is done trough the ddl log.
  This ensures that the ddl log is tested also outside of recovery.
- Added helper function 'handler::needs_lower_case_filenames()'
- Extend binary log with Q_XID events. ddl log handling is using this
  to check if a ddl log entry was logged to the binary log (if yes,
  it will be deleted from the log during ddl_log_close_binlogged_events()
- If a DDL entry fails 3 time, disable it. This is to ensure that if
  we have a crash in ddl recovery code the server will not get stuck
  in a forever crash-restart-crash loop.

mysqltest.cc changes:
- --die will now replace $variables with their values
- $error will contain the error of the last failed statement

storage engine changes:
- maria_rename() was changed to be more robust against crashes during
  rename.
2021-05-19 22:54:12 +02:00
Monty
85d6278fed Change replication to use uchar for all buffers instead of char
This change is to get rid of randomly failing tests, especially those
that reads random position of the binary log. From looking at the logs
it's clear that some failures is because of a read char (with value >= 128)
is converted to a big long value. Using uchar everywhere makes this much
less likely to happen.
Another benefit is that a lot of cast of char to uchar could be removed.

Other things:
- Removed some extra space before '=' and '+=' in assignments
- Fixed indentations and lines > 80 characters
- Replace '16' with 'element_size' (from class definition) in
  Gtid_list_log_event()
2021-05-19 22:54:12 +02:00
Monty
a206658b98 Change CHARSET_INFO character set and collaction names to LEX_CSTRING
This change removed 68 explict strlen() calls from the code.

The following renames was done to ensure we don't use the old names
when merging code from earlier releases, as using the new variables
for print function could result in crashes:
- charset->csname renamed to charset->cs_name
- charset->name renamed to charset->coll_name

Almost everything where mechanical changes except:
- Changed to use the new Protocol::store(LEX_CSTRING..) when possible
- Changed to use field->store(LEX_CSTRING*, CHARSET_INFO*) when possible
- Changed to use String->append(LEX_CSTRING&) when possible

Other things:
- There where compiler issues with ensuring that all character set names
  points to the same string: gcc doesn't allow one to use integer constants
  when defining global structures (constant char * pointers works fine).
  To get around this, I declared defines for each character set name
  length.
2021-05-19 22:54:07 +02:00
Marko Mäkelä
8de233af81 Merge 10.4 into 10.5 2021-01-11 16:29:51 +02:00
Sujatha
25ede13611 Merge branch '10.4' into 10.5 2020-09-29 16:59:36 +05:30
Marko Mäkelä
5ff7e68c7e Merge 10.4 into 10.5 2020-09-04 18:44:44 +03:00
Marko Mäkelä
50a11f396a Merge 10.4 into 10.5 2020-08-01 14:42:51 +03:00
Monty
80544a5878 Fixed rpl.rpl_mariadb_slave_capability.result file
The cause was an uninitalized variable on the slave when reading a dummy
event that can only be generated by the test.

Fixed by ensuring that flag2 is always initialized.
Fixed also some indentation issues and improved comments.
2020-03-25 16:30:53 +02:00
Monty
6a9e24d046 Added support for replication for S3
MDEV-19964 S3 replication support

Added new configure options:
s3_slave_ignore_updates
"If the slave has shares same S3 storage as the master"

s3_replicate_alter_as_create_select
"When converting S3 table to local table, log all rows in binary log"

This allows on to configure slaves to have the S3 storage shared or
independent from the master.

Other thing:
Added new session variable '@@sql_if_exists' to force IF_EXIST to DDL's.
2020-03-24 21:00:02 +02:00
Marko Mäkelä
5203bc10f1 Merge 10.4 into 10.5 2020-03-21 11:37:10 +02:00
Andrei Elkin
c8ae357341 MDEV-742 XA PREPAREd transaction survive disconnect/server restart
Lifted long standing limitation to the XA of rolling it back at the
transaction's
connection close even if the XA is prepared.

Prepared XA-transaction is made to sustain connection close or server
restart.
The patch consists of

    - binary logging extension to write prepared XA part of
      transaction signified with
      its XID in a new XA_prepare_log_event. The concusion part -
      with Commit or Rollback decision - is logged separately as
      Query_log_event.
      That is in the binlog the XA consists of two separate group of
      events.

      That makes the whole XA possibly interweaving in binlog with
      other XA:s or regular transaction but with no harm to
      replication and data consistency.

      Gtid_log_event receives two more flags to identify which of the
      two XA phases of the transaction it represents. With either flag
      set also XID info is added to the event.

      When binlog is ON on the server XID::formatID is
      constrained to 4 bytes.

    - engines are made aware of the server policy to keep up user
      prepared XA:s so they (Innodb, rocksdb) don't roll them back
      anymore at their disconnect methods.

    - slave applier is refined to cope with two phase logged XA:s
      including parallel modes of execution.

This patch does not address crash-safe logging of the new events which
is being addressed by MDEV-21469.

CORNER CASES: read-only, pure myisam, binlog-*, @@skip_log_bin, etc

Are addressed along the following policies.
1. The read-only at reconnect marks XID to fail for future
   completion with ER_XA_RBROLLBACK.

2. binlog-* filtered XA when it changes engine data is regarded as
   loggable even when nothing got cached for binlog.  An empty
   XA-prepare group is recorded. Consequent Commit-or-Rollback
   succeeds in the Engine(s) as well as recorded into binlog.

3. The same applies to the non-transactional engine XA.

4. @@skip_log_bin=OFF does not record anything at XA-prepare
   (obviously), but the completion event is recorded into binlog to
   admit inconsistency with slave.

The following actions are taken by the patch.

At XA-prepare:
   when empty binlog cache - don't do anything to binlog if RO,
   otherwise write empty XA_prepare (assert(binlog-filter case)).

At Disconnect:
   when Prepared && RO (=> no binlogging was done)
     set Xid_cache_element::error := ER_XA_RBROLLBACK
     *keep* XID in the cache, and rollback the transaction.

At XA-"complete":
   Discover the error, if any don't binlog the "complete",
   return the error to the user.

Kudos
-----
Alexey Botchkov took to drive this work initially.
Sergei Golubchik, Sergei Petrunja, Marko Mäkelä provided a number of
good recommendations.
Sergei Voitovich made a magnificent review and improvements to the code.
They all deserve a bunch of thanks for making this work done!
2020-03-14 22:45:48 +02:00
Sergei Golubchik
c1c5222cae cleanup: PSI key is *always* the first argument 2020-03-10 19:24:23 +01:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Marko Mäkelä
d04f2de80a Merge 10.4 into 10.5 2019-10-11 08:41:36 +03:00
Sachin Setiya
fc33c3cda5 MDEV-20591 Wrong Number of rows in mysqlbinlog output
calc_field_event_length should accurately calculate the size of BLOB type
fields, Instead of returning just the bytes taken by length it should return
length bytes + actual length.
2019-10-08 16:54:48 +05:30
Sachin
967c14c04e MDEV-20477 Merge binlog extended metadata support from the upstream
Cherry-pick the commits the mysql and some changes.
WL#4618 RBR: extended table metadata in the binary log

This patch extends Table Map Event. It appends some new fields for
more metadata. The new metadata includes:
- Signedness of Numberic Columns
- Character Set of Character Columns and Binary Columns
- Column Name
- String Value of SET Columns
- String Value of ENUM Columns
- Primary Key
- Character Set of SET Columns and ENUM Columns
- Geometry Type

Some of them are optional, the patch introduces a GLOBAL system
variable to control it. It is binlog_row_metadata.
- Scope:   GLOBAL
- Dynamic: Yes
- Type:    ENUM
- Values:  {NO_LOG, MINIMAL, FULL}
- Default: NO_LOG
  Only Signedness, character set and geometry type are logged if it is MINIMAL.
  Otherwise all of them are logged.

Also add a binlog_type_info() to field, So that we can have extract
relevant binlog info from field.
2019-09-11 15:09:35 +05:30
Alexander Barkov
55a2ca3e6a MDEV-19550 Move specific parts of log_event.cc to log_event_client.cc and log_event_server.cc 2019-05-23 05:23:42 +04:00