2690 commits

Author SHA1 Message Date
Sujatha
c7cdd049b5 MDEV-22451: SIGSEGV in __memmove_avx_unaligned_erms/memcpy from _my_b_write on CREATE after RESET MASTER
Merge branch '10.2' into 10.3
2020-05-20 21:02:39 +05:30
Sujatha
450a5b33a2 MDEV-22451: SIGSEGV in __memmove_avx_unaligned_erms/memcpy from _my_b_write on CREATE after RESET MASTER
Merge branch '10.1' into 10.2
2020-05-20 20:49:04 +05:30
Sujatha
836d708997 MDEV-22451: SIGSEGV in __memmove_avx_unaligned_erms/memcpy from _my_b_write on CREATE after RESET MASTER
Analysis:
========
RESET MASTER TO # command deletes all binary log files listed in the index
file, resets the binary log index file to be empty, and creates a new binary
log with number #. When the user-provided binary log number is greater than
the max allowed value '2147483647', the server fails to generate a new binary log.
The RESET MASTER statement marks the binlog closure status as
'LOG_CLOSE_TO_BE_OPENED' and exits. When statements that follow RESET MASTER
try to write to the binary log, they find log_state != LOG_CLOSED, proceed
to write to the binary log cache, and the server crashes.

Fix:
===
During MYSQL_BIN_LOG open, if generation of the new binary log name fails, then
"log_state" needs to be marked as "LOG_CLOSED". With this, further statements
will find the binary log closed and will skip writing to it.
2020-05-20 17:42:28 +05:30
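The MDEV-22451 fix described above can be pictured with a minimal sketch. Everything here (`BinlogSketch`, `generate_name`, `reset_and_open`, `write_event`, the enum values) is a hypothetical stand-in chosen for illustration, not the server's actual MYSQL_BIN_LOG code; it only shows why marking the state closed on a failed name generation makes later writers back off.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified log states (not the server's real enum).
enum LogStateSketch { LOG_OPENED, LOG_CLOSED, LOG_TO_BE_OPENED };

struct BinlogSketch {
  LogStateSketch log_state;
  std::string cache;  // stands in for the binary log cache

  BinlogSketch() : log_state(LOG_CLOSED) {}

  // Name generation fails when the number exceeds the allowed maximum.
  static bool generate_name(unsigned long long number, std::string *out) {
    if (number > 2147483647ULL) return false;
    *out = "binlog." + std::to_string(number);
    return true;
  }

  // Models RESET MASTER TO <number>: close for reopening, then reopen.
  bool reset_and_open(unsigned long long number) {
    log_state = LOG_TO_BE_OPENED;
    cache.clear();
    std::string name;
    if (!generate_name(number, &name)) {
      log_state = LOG_CLOSED;  // the idea of the fix: no half-open state
      return false;
    }
    log_state = LOG_OPENED;
    return true;
  }

  // Later statements write unless the log is marked closed.
  bool write_event(const std::string &event) {
    if (log_state == LOG_CLOSED) return false;  // skip binlogging
    cache += event;
    return true;
  }
};
```

Without the assignment in the failure branch, `write_event` would still see a state other than `LOG_CLOSED` and would try to use a cache that was never set up.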
Marko Mäkelä
2e12d471ea Merge 10.2 into 10.3 2020-04-27 14:24:41 +03:00
Marko Mäkelä
c06845d6f0 Merge 10.1 into 10.2 2020-04-27 13:28:13 +03:00
Marko Mäkelä
6be05ceb05 MDEV-22203: WSREP_ON is unnecessarily expensive to evaluate
This is a backport of the applicable part of
commit 93475aff8d and
commit 2c39f69d34
from 10.4.

Before 10.4 and Galera 4, WSREP_ON is a macro that points to
a global Boolean variable, so it is not that expensive to
evaluate, but we will add an unlikely() hint around it.

WSREP_ON_NEW: Remove. This macro was introduced in
commit c863159c32
when reverting WSREP_ON to its previous definition.

We replace some use of WSREP_ON with WSREP(thd), like it was done
in 93475aff8d. Note: the macro
WSREP() in 10.1 is equivalent to WSREP_NNULL() in 10.4.

Item_func_rand::seed_random(): Avoid invoking current_thd
when WSREP is not enabled.
2020-04-27 09:40:51 +03:00
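As a rough illustration of the unlikely() hint and of skipping a costly lookup when WSREP is off, here is a minimal sketch. `SKETCH_UNLIKELY`, `sketch_wsrep_on`, `expensive_thread_lookup` and `seed_random` are hypothetical names invented for this example; the macro assumes the GCC/Clang `__builtin_expect` builtin and degrades to a plain expression on other compilers.

```cpp
#include <cassert>

// Assumption: GCC/Clang builtin; other compilers get the bare condition.
#if defined(__GNUC__)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

// Hypothetical global flag standing in for a cheap Boolean variable.
static bool sketch_wsrep_on = false;

// Counts how often the costly path is taken (for demonstration only).
static int expensive_calls = 0;

// Stands in for a costly per-thread lookup such as current_thd.
static int expensive_thread_lookup() {
  ++expensive_calls;
  return 42;
}

// Only touch the expensive lookup when the rarely-set flag is on.
static int seed_random(int default_seed) {
  if (SKETCH_UNLIKELY(sketch_wsrep_on))
    return expensive_thread_lookup();
  return default_seed;
}
```

The hint does not change behavior; it only tells the compiler which branch to lay out as the fall-through path.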
Marko Mäkelä
fbe2712705 Merge 10.4 into 10.5
The functional changes of commit 5836191c8f
(MDEV-21168) are omitted due to MDEV-742 having addressed the issue.
2020-04-25 21:57:52 +03:00
Jan Lindström
93475aff8d MDEV-22203: WSREP_ON is unnecessarily expensive to evaluate
Replaced the WSREP_ON macro by a single global variable WSREP_ON
that is updated at server startup and in the wsrep_on and
wsrep_provider update functions.
2020-04-24 13:12:46 +03:00
Marko Mäkelä
84db10f27b Merge 10.2 into 10.3 2020-04-15 09:56:03 +03:00
Marko Mäkelä
ccc06931c3 Merge 10.4 into 10.5 2020-04-08 10:36:41 +03:00
Daniele Sciascia
bdcecfa22c MDEV-22021: Galera database could get inconsistent with rollback to savepoint
When binlog is disabled, WSREP will not behave correctly when
SAVEPOINT ROLLBACK is executed and we will not rollback transaction.
2020-03-31 14:18:21 +03:00
mkaruza
2d16452a31 MDEV-22021: Galera database could get inconsistent with rollback to savepoint
When binlog is disabled, WSREP will not behave correctly when
SAVEPOINT ROLLBACK is executed since we don't register handlers for such case.
Fixed by registering WSREP handlerton for SAVEPOINT related commands.
2020-03-31 09:59:37 +03:00
Monty
120b73a069 Speed up writing to encrypted binlogs
MDEV-21604

Added "virtual" low level write function encrypt_or_write that is set
to point to either normal or encrypted write functions.

This patch also fixes a possible memory leak if writing to binary log fails.
2020-03-24 21:00:03 +02:00
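The "virtual" low level write function mentioned above can be sketched as a member function pointer that is chosen once and then called without further branching. `WriterSketch` and its members are hypothetical, and the XOR step is a toy placeholder, not real encryption and not what the server does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct WriterSketch {
  typedef bool (WriterSketch::*WriteFn)(const std::string &);

  WriteFn encrypt_or_write;  // selected once, when the writer is set up
  std::string out;           // stands in for the log file

  explicit WriterSketch(bool encrypted)
    : encrypt_or_write(encrypted ? &WriterSketch::write_encrypted
                                 : &WriterSketch::write_plain) {}

  // Hot path: no per-call "is encryption on?" test.
  bool write(const std::string &data) {
    return (this->*encrypt_or_write)(data);
  }

  bool write_plain(const std::string &data) {
    out += data;
    return true;
  }

  // Toy transformation only; a placeholder for a real cipher.
  bool write_encrypted(const std::string &data) {
    for (std::size_t i = 0; i < data.size(); ++i)
      out += static_cast<char>(data[i] ^ 0x5A);
    return true;
  }
};
```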
Monty
91ab42a823 Clean up and speed up interfaces for binary row logging
MDEV-21605 Clean up and speed up interfaces for binary row logging
MDEV-21617 Bug fix for previous version of this code

The intention is to have as few 'if' as possible in ha_write() and
related functions. This is done by pre-calculating once per statement the
row_logging state for all tables.

Benefits are simpler and faster code both when binary logging is disabled
and when it's enabled.

Changes:
- Added handler->row_logging to make it easy to check if a table should be
  row logged. This also made it easier to disable row logging for system,
  internal and temporary tables.
- The tables' row_logging capabilities are checked once per "statement
  that updates tables" in THD::binlog_prepare_for_row_logging(), which
  is called when needed from THD::decide_logging_format().
- Removed most usage of tmp_disable_binlog(), reenable_binlog() and
  temporary saving and setting of thd->variables.option_bits.
- Moved checks that can't change during a statement from
  check_table_binlog_row_based() to check_table_binlog_row_based_internal()
- Removed flag row_already_logged (used by sequence engine)
- Moved binlog_log_row() to a handler::
- Moved write_locked_table_maps() to THD::binlog_write_table_maps() as
  most other related binlog functions are in THD.
- Removed binlog_write_table_map() and binlog_log_row_internal() as
  they are now obsolete as 'has_transactions()' is pre-calculated in
  prepare_for_row_logging().
- Remove 'is_transactional' argument from binlog_write_table_map() as this
  can now be read from handler.
- Changed order of 'if's in handler::external_lock() and wsrep_mysqld.h
  to first evaluate fast and likely cases before more complex ones.
- Added error checking in ha_write_row() and related functions if
  binlog_log_row() failed.
- Don't clear check_table_binlog_row_based_result in
  clear_cached_table_binlog_row_based_flag() as it's not needed.
- THD::clear_binlog_table_maps() has been replaced with
  THD::reset_binlog_for_next_statement()
- Added 'MYSQL_OPEN_IGNORE_LOGGING_FORMAT' flag to open_and_lock_tables()
  to avoid calculating of binary log format for internal opens. This flag
  is also used to avoid reading statistics tables for internal tables.
- Added OPTION_BINLOG_LOG_OFF as a simple way to turn off binlog temporarily
  for create (instead of using THD::sql_log_bin_off).
- Removed flag THD::sql_log_bin_off (not needed anymore)
- Speed up THD::decide_logging_format() by remembering if blackhole engine
  is used and avoid a loop over all tables if it's not used
  (the common case).
- THD::decide_logging_format() is not called anymore if no tables are used
  for the statement. This will speed up pure stored procedure code by
  about 5%+ according to some simple tests.
- We now get annotated events on slave if a CREATE ... SELECT statement
  is transformed on the slave from statement to row logging.
- In the original code, the master could come into a state where row
  logging is enforced for all future events if statement could be used.
  This is now partly fixed.

Other changes:
- Ensure that all tables used by a statement have query_id set.
- Had to restore the row_logging flag for not used tables in
  THD::binlog_write_table_maps (not normal scenario)
- Removed injector::transaction::use_table(server_id_type sid, table tbl)
  as it's not used.
- Cleaned up set_slave_thread_options()
- Some more DBUG_ENTER/DBUG_RETURN, code comments and minor indentation
  changes.
- Ensure we only call THD::decide_logging_format_low() once in
  mysql_insert() (inefficiency).
- Don't annotate INSERT DELAYED
- Removed zeroing pos_in_table_list in THD::open_temporary_table() as it's
  already 0
2020-03-24 21:00:03 +02:00
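The core idea of the row logging cleanup above, computing the row_logging decision once per statement so that the per-row path tests a single flag, can be sketched as follows. `TableSketch`, `prepare_for_row_logging` and `write_row` are simplified, hypothetical stand-ins and do not mirror the real handler or THD interfaces.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified table descriptor.
struct TableSketch {
  bool is_temporary;
  bool is_system;
  bool row_logging;  // cached decision, valid for the current statement
  TableSketch(bool temporary, bool system)
    : is_temporary(temporary), is_system(system), row_logging(false) {}
};

// Runs once per statement: all slow checks happen here.
void prepare_for_row_logging(std::vector<TableSketch> &tables,
                             bool binlog_enabled, bool row_format) {
  for (TableSketch &t : tables)
    t.row_logging = binlog_enabled && row_format &&
                    !t.is_temporary && !t.is_system;
}

// Runs once per row: a single flag test decides whether to log.
// Returns true if the row was logged.
bool write_row(const TableSketch &t, int *rows_logged) {
  if (t.row_logging) {
    ++*rows_logged;
    return true;
  }
  return false;
}
```

The trade-off is that the cached flag must be recomputed for every statement, which is the role the commit message gives to the per-statement preparation step.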
Monty
6a9e24d046 Added support for replication for S3
MDEV-19964 S3 replication support

Added new configure options:
s3_slave_ignore_updates
"If the slave has shares same S3 storage as the master"

s3_replicate_alter_as_create_select
"When converting S3 table to local table, log all rows in binary log"

This allows one to configure slaves to have the S3 storage shared or
independent from the master.

Other thing:
Added new session variable '@@sql_if_exists' to force IF_EXIST to DDL's.
2020-03-24 21:00:02 +02:00
Marko Mäkelä
5203bc10f1 Merge 10.4 into 10.5 2020-03-21 11:37:10 +02:00
Daniele Sciascia
9394cc8914 MDEV-21675: Data inconsistency after multirow insert rollback (#1474)
* Remove dead code

* MDEV-21675 Data inconsistency after multirow insert rollback

This patch fixes data inconsistencies that happen after rollback of
multirow inserts, with binlog disabled.
For example, statements such as `INSERT INTO t1 VALUES (1,'a'),(1,'b')`
that fail with duplicate key error. In such cases the whole statement
is rolled back. However, with wsrep_emulate_binlog in effect, the
IO_CACHE would not be truncated, and the pending rows events would be
replicated to the rest of the cluster. In the above example, it would
result in row (1,'a') being replicated, whereas locally the statement
is rolled back entirely, making the cluster inconsistent.
The patch changes the code so that prior to statement rollback,
pending rows events are removed and the stmt cache is reset.
The patch also introduces MTR tests that exercise multirow insert
statements for regular and streaming replication.
2020-03-21 09:17:28 +02:00
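The MDEV-21675 behavior described above can be modelled with a small sketch: rows of a multirow insert accumulate in a pending event, and a statement rollback must drop both the pending event and anything the statement already put into the cache. `StmtCacheSketch` and `multirow_insert` are hypothetical names; the real code operates on an IO_CACHE and rows log events.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct StmtCacheSketch {
  std::vector<std::string> flushed;  // events already written to the cache
  std::string pending;               // pending rows event, not yet flushed
  std::size_t stmt_start;            // cache position at statement start

  StmtCacheSketch() : stmt_start(0) {}

  void begin_statement() { stmt_start = flushed.size(); }
  void add_row(const std::string &row) { pending += row; }
  void flush_pending() {
    if (!pending.empty()) {
      flushed.push_back(pending);
      pending.clear();
    }
  }
  // Statement rollback: remove pending rows and reset the cache
  // to where it was when the statement began.
  void rollback_statement() {
    pending.clear();
    flushed.resize(stmt_start);
  }
};

// Inserts keys one by one; a duplicate key fails the whole statement.
bool multirow_insert(StmtCacheSketch &cache, const std::vector<int> &keys) {
  cache.begin_statement();
  std::set<int> seen;
  for (std::size_t i = 0; i < keys.size(); ++i) {
    if (!seen.insert(keys[i]).second) {
      cache.rollback_statement();  // nothing from this statement survives
      return false;
    }
    cache.add_row("row:" + std::to_string(keys[i]) + ";");
  }
  cache.flush_pending();
  return true;
}
```

If `rollback_statement` left `pending` untouched, the first row of the failed insert would still be shipped later, which is the inconsistency the commit message describes.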
Andrei Elkin
c8ae357341 MDEV-742 XA PREPAREd transaction survive disconnect/server restart
Lifted the long-standing XA limitation of rolling a transaction back at
connection close even if the XA transaction is prepared.

Prepared XA-transaction is made to sustain connection close or server
restart.
The patch consists of

    - binary logging extension to write the prepared XA part of a
      transaction, identified by
      its XID, in a new XA_prepare_log_event. The conclusion part,
      with the Commit or Rollback decision, is logged separately as
      Query_log_event.
      That is, in the binlog the XA consists of two separate groups of
      events.

      That makes the whole XA possibly interweaving in binlog with
      other XA:s or regular transaction but with no harm to
      replication and data consistency.

      Gtid_log_event receives two more flags to identify which of the
      two XA phases of the transaction it represents. With either flag
      set also XID info is added to the event.

      When binlog is ON on the server XID::formatID is
      constrained to 4 bytes.

    - engines are made aware of the server policy to keep up user
      prepared XA:s so they (Innodb, rocksdb) don't roll them back
      anymore at their disconnect methods.

    - slave applier is refined to cope with two phase logged XA:s
      including parallel modes of execution.

This patch does not address crash-safe logging of the new events which
is being addressed by MDEV-21469.

CORNER CASES: read-only, pure myisam, binlog-*, @@skip_log_bin, etc

Are addressed along the following policies.
1. The read-only at reconnect marks XID to fail for future
   completion with ER_XA_RBROLLBACK.

2. binlog-* filtered XA when it changes engine data is regarded as
   loggable even when nothing got cached for binlog.  An empty
   XA-prepare group is recorded. Consequent Commit-or-Rollback
   succeeds in the Engine(s) as well as recorded into binlog.

3. The same applies to the non-transactional engine XA.

4. @@skip_log_bin=OFF does not record anything at XA-prepare
   (obviously), but the completion event is recorded into binlog to
   admit inconsistency with slave.

The following actions are taken by the patch.

At XA-prepare:
   when empty binlog cache - don't do anything to binlog if RO,
   otherwise write empty XA_prepare (assert(binlog-filter case)).

At Disconnect:
   when Prepared && RO (=> no binlogging was done)
     set Xid_cache_element::error := ER_XA_RBROLLBACK
     *keep* XID in the cache, and rollback the transaction.

At XA-"complete":
   Discover the error, if any don't binlog the "complete",
   return the error to the user.

Kudos
-----
Alexey Botchkov took to drive this work initially.
Sergei Golubchik, Sergei Petrunja, Marko Mäkelä provided a number of
good recommendations.
Sergei Voitovich made a magnificent review and improvements to the code.
They all deserve a bunch of thanks for making this work done!
2020-03-14 22:45:48 +02:00
Alexander Barkov
a1e330de5a MDEV-21743 Split up SUPER privilege to smaller privileges 2020-03-10 23:49:47 +04:00
Sergei Golubchik
cbede21d0d cleanup: pass trxid by value 2020-03-10 19:24:23 +01:00
Sergei Golubchik
c1c5222cae cleanup: PSI key is *always* the first argument 2020-03-10 19:24:23 +01:00
Sergei Golubchik
81cffda2e6 perfschema transaction instrumentation related changes 2020-03-10 19:24:23 +01:00
Sergei Golubchik
22b6d8487a perfschema file instrumentation related changes 2020-03-10 19:24:22 +01:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Sergei Golubchik
2ac3121af2 perfschema - various collateral cleanups and small changes 2020-03-10 19:24:22 +01:00
Oleksandr Byelkin
4b087e1754 Merge branch '10.4' into 10.5 2020-02-12 08:55:17 +01:00
Oleksandr Byelkin
646d1ec83a Merge branch '10.3' into 10.4 2020-02-11 14:40:35 +01:00
Marko Mäkelä
5ff66fb0b9 Merge 10.2 into 10.3 2020-01-31 11:37:12 +02:00
Marko Mäkelä
2daf3b14fe Merge 10.1 into 10.2 2020-01-31 10:53:56 +02:00
mkaruza
41bc736871 Galera GTID support
Support for Galera GTID consistency throughout the cluster. All nodes in the cluster
should have the same GTID for replicated events which originate from the cluster.
Cluster originating commands need to contain a sequential WSREP GTID seqno.
Ignore manual setting of gtid_seq_no=X.

In master-slave scenario where master is non galera node replicated GTID is
replicated and is preserved in all nodes.

To have this - domain_id, server_id and seqnos should be same on all nodes.
Node which bootstraps the cluster, to achieve this, sends domain_id and
server_id to other nodes and this combination is used to write GTID for events
that are replicated inside cluster.

Cluster nodes that are executing non replicated events are going to have different
GTID than replicated ones, difference will be visible in domain part of gtid.

With wsrep_gtid_domain_id you can set domain_id for WSREP cluster.

Functions WSREP_LAST_WRITTEN_GTID, WSREP_LAST_SEEN_GTID and
WSREP_SYNC_WAIT_UPTO_GTID now work with "native" GTID format.

Fixed galera tests to reflect these changes.

Add variable to manually update WSREP GTID seqno in cluster

Add variable to manipulate and change WSREP GTID seqno. The next command
originating from the cluster and on the same thread will have the set seqno, and
the cluster should change its internal counter to that value.
Behavior is the same as using @@gtid_seq_no for a non-WSREP transaction.
2020-01-29 15:06:06 +02:00
Sujatha
d89bb88674 MDEV-20923:UBSAN: member access within address … which does not point to an object of type 'xid_count_per_binlog'
Problem:
-------
Accessing a member within 'xid_count_per_binlog' structure results in
following error when 'UBSAN' is enabled.

member access within address 0xXXX which does not point to an object of type
'xid_count_per_binlog'

Analysis:
---------
The problem appears to be that no constructor for 'xid_count_per_binlog' is
being called, and thus the vtable will not be initialized.

Fix:
---
Defined a parameterized constructor for 'xid_count_per_binlog' class.
2020-01-29 16:33:05 +05:30
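To illustrate the MDEV-20923 analysis above: an object of a class with a virtual member that lives in raw allocated memory only becomes a valid object once a constructor runs on that memory. The sketch below shows one way to do that with placement new and a parameterized constructor. `LinkBaseSketch`, `XidCountSketch`, `make_entry` and `destroy_entry` are hypothetical, and whether the actual patch constructs the object this way is an assumption, since the commit message only states that a parameterized constructor was defined.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical base with a virtual destructor, so objects carry a vtable.
struct LinkBaseSketch {
  virtual ~LinkBaseSketch() {}
};

struct XidCountSketch : LinkBaseSketch {
  char *binlog_name;
  unsigned binlog_name_len;
  unsigned long binlog_id;
  long xid_count;

  // Parameterized constructor: initializes the members, and running it
  // is what sets up the vtable pointer.
  XidCountSketch(char *name, unsigned len, unsigned long id)
    : binlog_name(name), binlog_name_len(len), binlog_id(id), xid_count(0) {}
};

// Allocates the object and its name in one raw block, then constructs
// the object in place instead of only assigning its fields.
XidCountSketch *make_entry(const char *name, unsigned long id) {
  unsigned len = static_cast<unsigned>(std::strlen(name));
  void *raw = std::malloc(sizeof(XidCountSketch) + len + 1);
  if (!raw) return nullptr;
  char *name_mem = static_cast<char *>(raw) + sizeof(XidCountSketch);
  std::memcpy(name_mem, name, len + 1);
  return new (raw) XidCountSketch(name_mem, len, id);
}

// Mirrors make_entry: explicit destructor call, then free the raw block.
void destroy_entry(XidCountSketch *entry) {
  if (!entry) return;
  entry->~XidCountSketch();
  std::free(entry);
}
```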
Marko Mäkelä
ded128aa9b Merge 10.4 into 10.5 2020-01-20 16:48:56 +02:00
Marko Mäkelä
87a61355e8 Merge 10.3 into 10.4
The MDEV-17062 fix in commit c4195305b2
was omitted.
2020-01-20 15:49:48 +02:00
Sergei Petrunia
e709eb9bf7 Merge branch '10.2' into 10.3
# Conflicts:
#	mysql-test/suite/galera/r/MW-388.result
#	mysql-test/suite/galera/t/MW-388.test
#	mysql-test/suite/innodb/r/truncate_inject.result
#	mysql-test/suite/innodb/t/truncate_inject.test
#	mysql-test/suite/rpl/r/rpl_stop_slave.result
#	mysql-test/suite/rpl/t/rpl_stop_slave.test
#	sql/sp_head.cc
#	sql/sp_head.h
#	sql/sql_lex.cc
#	sql/sql_yacc.yy
#	storage/xtradb/buf/buf0dblwr.cc
2020-01-17 00:46:40 +03:00
Alexander Barkov
497ee33848 MDEV-21497 Make Field_time, Field_datetime, Field_timestamp abstract
- Making classes Field_time, Field_datetime, Field_timestamp abstract
- Adding instantiable Field_time0, Field_datetime0, Field_timestamp0 classes
- Removing redundant cast in field_conv.cc, item_timefunc.cc, sp.cc in calls for set_time() and get_timestamp()
- Replacing store_TIME() to store_timestamp() in log.cc and removing redundant cast
2020-01-16 09:59:39 +04:00
Sujatha
41cde4fe22 MDEV-18514: Assertion `!writer.checksum_len || writer.remains == 0' failed
Analysis:
========
'max_binlog_cache_size' is configured and a huge transaction is executed. When
the transaction specific events size exceeds 'max_binlog_cache_size' the event
cannot be written to the binary log cache and cache write error is raised.
Upon cache write error the statement is rolled back and the transaction cache
should be truncated to a previous statement specific position.  The truncate
operation should reset the cache to earlier valid positions and flush the new
changes. Even though the flush is successful the cache write error is still in
marked state. The truncate code interprets the cache write error as cache flush
failure and returns abruptly without modifying the write cache parameters.
Hence the cache is in an invalid state. When a COMMIT statement is executed in this
session it tries to flush the contents of transaction cache to binary log.
Since cache has partial events the cache write operation will report
'writer.remains' assert.

Fix:
===
The binlog truncate function resets the cache to a specified size. As a first step
of truncation, clear the cache write error flag that was raised during earlier
execution. With this, new errors that surface during cache truncation can be
clearly identified.
2020-01-09 12:45:05 +05:30
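The MDEV-18514 fix above comes down to the order of two steps inside truncation. A minimal sketch, assuming a toy cache (`CacheSketch`, with `write` and `truncate` as hypothetical members) where an oversized write leaves a partial event behind and raises a sticky error flag:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct CacheSketch {
  std::string data;      // cached event bytes
  std::size_t max_size;  // stands in for max_binlog_cache_size
  bool write_error;      // sticky flag raised by a failed write

  explicit CacheSketch(std::size_t limit)
    : max_size(limit), write_error(false) {}

  // Writes as much as fits; an overflow leaves a partial event behind.
  bool write(const std::string &event) {
    std::size_t room = max_size - data.size();
    if (event.size() > room) {
      data += event.substr(0, room);
      write_error = true;
      return false;
    }
    data += event;
    return true;
  }

  // Resets the cache to an earlier valid position.
  bool truncate(std::size_t pos) {
    write_error = false;  // first step: forget the earlier write error
    if (pos > data.size()) {
      write_error = true;  // a genuinely new error during truncation
      return false;
    }
    data.resize(pos);
    return !write_error;
  }
};
```

If the flag were not cleared first, the final check would report the stale error and a caller could not tell it apart from a failure of the truncation itself.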
Marko Mäkelä
ae90f8431b Merge 10.4 into 10.5 2019-11-14 14:49:20 +02:00
Sujatha
caa79081c3 MDEV-20707: Missing memory barrier in parallel replication error handler in wait_for_prior_commit()
revision-id: 673e253724979fd9fe43a4a22bd7e1b2c3a5269e
Author: Kristian Nielsen

Fix missing memory barrier in wait_for_commit.

The function wait_for_commit::wait_for_prior_commit() has a fast path where it
checks without locks if wakeup_subsequent_commits() has already been called.
This check was missing a memory barrier. The waitee thread does two writes to
variables `waitee' and `wakeup_error', and if the waiting thread sees the
first write it _must_ also see the second or incorrect behavior will occur.
This requires memory barriers between both the writes (release semantics) and
the reads (acquire semantics) of those two variables.

Other accesses to these variables are done under lock or where only one thread
will be accessing them, and can be done without barriers (relaxed semantics).
2019-11-14 12:03:39 +05:30
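The release/acquire pairing described for MDEV-20707 can be sketched with `std::atomic`. This is not the server's code: `WaitSketch`, `wakeup` and `try_fast_path` are hypothetical, the `waiting` flag stands in for the `waitee` pointer, and the ordering of the two writes shown here (error first, then the published flag) is an assumption made for the illustration.

```cpp
#include <atomic>
#include <cassert>

struct WaitSketch {
  std::atomic<bool> waiting;  // true until the prior commit wakes us up
  int wakeup_error;           // plain data published through `waiting`

  WaitSketch() : waiting(true), wakeup_error(0) {}

  // Waker side: write the payload, then publish with release semantics
  // so the payload write cannot be reordered after the flag store.
  void wakeup(int error) {
    wakeup_error = error;
    waiting.store(false, std::memory_order_release);
  }

  // Waiter fast path, no lock taken: the acquire load pairs with the
  // release store, so seeing waiting == false guarantees that the
  // wakeup_error value written before it is visible too.
  bool try_fast_path(int *error) const {
    if (waiting.load(std::memory_order_acquire)) return false;
    *error = wakeup_error;
    return true;
  }
};
```

In real use `wakeup` and `try_fast_path` would run on different threads; accesses made under a lock, or where only one thread can touch the variables, could use relaxed ordering, as the commit message notes.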
Oleksandr Byelkin
3ad37ed0eb Merge 10.4 into 10.5 2019-11-07 08:52:30 +01:00
Marko Mäkelä
ec40980ddd Merge 10.3 into 10.4 2019-11-01 15:23:18 +02:00
Daniele Sciascia
f4ba775914 MDEV-17099 Preliminary changes for Galera XA support (#1404)
Redo changes reverted in commit
8f46e3833c, this time without build
issues in wsrep-lib.
2019-10-30 10:45:22 +02:00
Marko Mäkelä
8f46e3833c Revert MDEV-17099 Preliminary changes for Galera XA support (#1401)
This reverts commit 2b5f4b3ed6
due to build failures.
2019-10-28 16:16:21 +02:00
Daniele Sciascia
2b5f4b3ed6 MDEV-17099 Preliminary changes for Galera XA support (#1401)
Update wsrep-lib, and adapt to wsrep-lib interface changes.
2019-10-24 14:05:32 +03:00
Michael Widenius
716d396bb3 Remove \n from DBUG_PRINT statements 2019-10-21 18:41:58 +03:00
Monty
b62101f84b Fixes for binary logging --read-only mode
- Any temporary tables created under read-only mode will never be logged
  to binary log.  Any usage of these tables to update normal tables, even
  after read-only has been disabled, will use row-based logging (as the
  temporary table will not be on the slave).
- Analyze, check and repair table will not be logged in read-only mode.

Other things:
- Removed unused variables in
  MYSQL_BIN_LOG::flush_and_set_pending_rows_event.
- Set table_share->table_creation_was_logged for all normal tables.
- THD::binlog_query() now returns -1 if statement was not logged. This
  is used to update table_share->table_creation_was_logged.
- Don't log admin statements if opt_readonly is set.
- Tables that don't have table_creation_was_logged will set binlog format to row
  logging.
- Removed not needed/wrong setting of table->s->table_creation_was_logged
  in create_table_from_items()
2019-10-20 11:52:29 +03:00
Alexey Yurchenko
41fa564c88 MDEV-17048 Inconsistency voting support (#1373)
* Collect and pass apply error data to provider
* Rollback failed transaction and continue operation if provider returns
  SUCCESS
* MTR tests for inconsistency voting
2019-08-28 09:19:24 +03:00
Sergey Vojtovich
afe969ba05 Removed redundant log_type == LOG_BIN checks 2019-08-22 13:20:30 +04:00
Sergey Vojtovich
6b0b25a25b Cleanup log_type_arg of MYSQL_BIN_LOG::open()
It is always LOG_BIN anyway.
2019-08-22 13:20:30 +04:00
Sergey Vojtovich
e976d95614 Cleanup MYSQL_LOG
Embed MYSQL_LOG::init().
Reduce visibility of MYSQL_LOG::init_and_set_log_file_name().
Cleanup unused mysql_bin_log_file_name() and mysql_bin_log_file_pos().
2019-08-22 13:20:30 +04:00
Alexander Barkov
afe6eb499d Revert "MDEV-20342 Turn Field::flags from a member to a method"
This reverts commit e86010f909.

Reverting on Monty's request, as this change makes merging
things from 10.5 to 10.2 much harder.
2019-08-14 20:27:00 +04:00