Commit graph

3031 commits

Marko Mäkelä
c06845d6f0 Merge 10.1 into 10.2 2020-04-27 13:28:13 +03:00
Marko Mäkelä
6be05ceb05 MDEV-22203: WSREP_ON is unnecessarily expensive to evaluate
This is a backport of the applicable part of
commit 93475aff8d and
commit 2c39f69d34
from 10.4.

Before 10.4 and Galera 4, WSREP_ON is a macro that points to
a global Boolean variable, so it is not that expensive to
evaluate, but we will add an unlikely() hint around it.

WSREP_ON_NEW: Remove. This macro was introduced in
commit c863159c32
when reverting WSREP_ON to its previous definition.

We replace some uses of WSREP_ON with WSREP(thd), as was done
in 93475aff8d. Note: the macro
WSREP() in 10.1 is equivalent to WSREP_NNULL() in 10.4.

Item_func_rand::seed_random(): Avoid invoking current_thd
when WSREP is not enabled.
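
A standalone sketch of the pattern behind the change (not the server code;
wsrep_enabled, lookup_current_thd and the seed mixing below are illustrative
stand-ins for WSREP_ON, current_thd and the real seeding):

  #include <atomic>

  static std::atomic<bool> wsrep_enabled{false};  // stands in for the WSREP_ON global

  static thread_local unsigned thread_seed_state= 0;

  static unsigned *lookup_current_thd()           // stands in for the current_thd TLS lookup
  {
    return &thread_seed_state;
  }

  unsigned seed_random(unsigned tmp)
  {
    /* __builtin_expect(..., 0) plays the role of the unlikely() hint;
       the TLS lookup is only paid for when the flag is actually set. */
    if (__builtin_expect(wsrep_enabled.load(std::memory_order_relaxed), 0))
    {
      unsigned *thd= lookup_current_thd();
      tmp^= *thd;
    }
    return tmp;
  }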
2020-04-27 09:40:51 +03:00
Marko Mäkelä
89698cef72 MDEV-10570: Fix -Wmaybe-uninitialized
Rows_log_event::change_to_flashback_event(): Reduce the scope
of the variable swap_buff2, and do not duplicate conditions.

GCC 9.3.0 emitted the -Wmaybe-uninitialized warning when compiling the
10.5 branch with cmake -DWITH_ASAN=ON -DCMAKE_CXX_FLAGS=-O2
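
A generic before/after sketch of this kind of cleanup (standalone; swap_buff2
itself lives in log_event.cc, the code below only illustrates the scope
reduction and the removed duplicate condition):

  #include <cstring>
  #include <cstddef>

  // Before: the buffer is declared at function scope, conditionally assigned,
  // and used/freed under a duplicated condition, so the compiler may not be
  // able to prove it is always initialized before use.
  void flashback_before(char *rows, std::size_t len, bool have_after_image)
  {
    char *swap_buff2;
    if (have_after_image)
      swap_buff2= new char[len];
    if (have_after_image)
    {
      std::memcpy(swap_buff2, rows, len);
      delete[] swap_buff2;
    }
  }

  // After: the variable lives only in the single block that allocates,
  // uses and frees it, and the condition is written once.
  void flashback_after(char *rows, std::size_t len, bool have_after_image)
  {
    if (have_after_image)
    {
      char *swap_buff2= new char[len];
      std::memcpy(swap_buff2, rows, len);
      delete[] swap_buff2;
    }
  }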
2020-03-17 10:35:56 +02:00
seppo
4618c974e4 MDEV-21723 Async slave thread BF abort and replaying fixes (#1448)
If the async replication slave thread conflicts with cluster replication,
then the async slave transaction should be BF aborted and, depending on the
state of async slave transaction execution, potentially also replayed.
There were problems in this BF abort implementation, and the replaying was
not started.
This pull request contains fixes which make sure that if the async slave thread
is marked to abort and replay, it will carry out the rollback and
release all locks and resources before starting the replaying. After replaying,
the async slave transaction is treated as successful, so the slave thread will
continue as usual, handling the next replication event.

There is also a new mtr test, galera.galera_slave_replay, which stresses both a
certification failure for the async slave thread and a successful BF abort
followed by replaying.
2020-02-23 10:29:42 +02:00
Oleksandr Byelkin
f2ccfcaca1 Merge branch '10.1' into 10.2 2020-01-24 13:46:49 +01:00
Sujatha
599a06098b MDEV-21490: binlog tests fail with valgrind: Conditional jump or move depends on uninitialised value in sql_ex_info::init
Problem:
=======
P1) Conditional jump or move depends on uninitialised value(s)
    sql_ex_info::init(char const*, char const*, bool) (log_event.cc:3083)

code: None of the following variables are initialized.
----
  return ((cached_new_format != -1) ? cached_new_format :
    (cached_new_format=(field_term_len > 1 || enclosed_len > 1 ||
    line_term_len > 1 || line_start_len > 1 || escaped_len > 1)));

P2) Conditional jump or move depends on uninitialised value(s)
    Rows_log_event::Rows_log_event(char const*, unsigned
      int, Format_description_log_event const*) (log_event.cc:9571)

Code: An uninitialized value is reported for the 'var_header_len' variable.
----
  if (var_header_len < 2 || event_len < static_cast<unsigned
      int>(var_header_len + (post_start - buf)))

P3) Conditional jump or move depends on uninitialised value(s)
    Table_map_log_event::pack_info(Protocol*) (log_event.cc:11553)

code: 'm_table_id' is uninitialized.
----
  void Table_map_log_event::pack_info(Protocol *protocol)
  ...
  size_t bytes= my_snprintf(buf, sizeof(buf), "table_id: %lu (%s.%s)",
                              m_table_id, m_dbnam, m_tblnam);

Fix:
===
P1 - Fix)
Initialize the cached_new_format, field_term_len, enclosed_len, line_term_len,
line_start_len and escaped_len members in the default constructor.

P2 - Fix)
"var_header_len" is initialized by reading the event buffer. In case of an
invalid event the buffer will contain invalid data. Hence added a check to
validate the event data. If event_len is smaller than valid header length
return immediately.

P3 - Fix)
'm_table_id' within Table_map_log_event is initialized by reading data from
the event buffer. Use the 'VALIDATE_BYTES_READ' macro to validate the current
state of the buffer. If it is invalid, return immediately.
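
A minimal sketch of the P1 pattern (class body reduced; the real fix
initializes these members in sql_ex_info's default constructor rather than
with the in-class initializers shown here):

  struct sql_ex_info_sketch
  {
    int cached_new_format= -1;   // -1 means "not computed yet"
    int field_term_len= 0;
    int enclosed_len= 0;
    int line_term_len= 0;
    int line_start_len= 0;
    int escaped_len= 0;

    int new_format()
    {
      // With every input initialized, the cached expression from P1 no
      // longer reads indeterminate values.
      return ((cached_new_format != -1) ? cached_new_format :
              (cached_new_format= (field_term_len > 1 || enclosed_len > 1 ||
                                   line_term_len > 1 || line_start_len > 1 ||
                                   escaped_len > 1)));
    }
  };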
2020-01-24 13:35:03 +05:30
Sujatha
8317f77ccc Merge branch '10.1' into 10.2
MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events

Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following assert when ASAN is enabled.

uint32 binlog_get_uncompress_len(const char*):
  Assertion `(buf[0] & 0xe0) == 0x80' failed

Fix:
===
**Part11: Converted debug assert to error handler code**
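
A hedged sketch of what "converted to error handler code" means here (the
real function is binlog_get_uncompress_len(); the header decoding below is
simplified and the names are illustrative):

  #include <cstdint>
  #include <cstddef>

  // Before (debug builds only): DBUG_ASSERT((buf[0] & 0xe0) == 0x80);
  // After: a malformed compressed-event header is reported to the caller.
  uint32_t uncompress_len_checked(const unsigned char *buf, std::size_t len)
  {
    if (len == 0 || (buf[0] & 0xe0) != 0x80)
      return 0;                           // 0 tells the caller the header is invalid
    std::size_t lenbytes= buf[0] & 0x07;  // illustrative: low bits give length size
    if (lenbytes + 1 > len)
      return 0;                           // length field would run past the buffer
    uint32_t result= 0;
    for (std::size_t i= 0; i < lenbytes; i++)
      result= (result << 8) | buf[i + 1];
    return result;
  }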
2020-01-07 21:29:07 +05:30
Sujatha
cb204e11ea MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following ASAN error.

AddressSanitizer: heap-buffer-overflow on address
READ of size 1 at 0x60e00009cf71 thread T28
#0 0x55e37e034ae2 in net_field_length

Fix:
===
**Part10: Avoid reading out of buffer**
2020-01-07 18:27:06 +05:30
Sujatha
d05c511d34 MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following assert when ASAN is enabled.

Query_log_event::Query_log_event(const char*, uint,
    const Format_description_log_event*, Log_event_type):
  Assertion `(pos) + (6) <= (end)' failed

Fix:
===
**Part9: Removed additional DBUG_ASSERT**
2020-01-07 18:27:06 +05:30
Sujatha
bac3353361 MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following ASAN error

AddressSanitizer: SEGV on unknown address
The signal is caused by a READ memory access.
User_var_log_event::User_var_log_event(char const*, unsigned int,
    Format_description_log_event const*)

Implemented part of upstream patch.
commit: mysql/mysql-server@a3a497ccf7

Fix:
===
**Part8: added checks to avoid reading out of buffer limits**
2020-01-07 18:27:05 +05:30
Sujatha
2187f1c2ca MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following ASAN error,
"heap-buffer-overflow on address", and sometimes it asserts.

Table_map_log_event::Table_map_log_event(const char*, uint,
    const Format_description_log_event*)
Assertion `m_field_metadata_size <= (m_colcnt * 2)' failed.

Fix:
===
**Part7: Avoid reading out of buffer**


Converted debug assert to error handler code.
2020-01-07 18:27:05 +05:30
Sujatha
d6fa69e4be MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following ASAN error

AddressSanitizer: heap-buffer-overflow on address 0x60400002acb8
Load_log_event::copy_log_event(char const*, unsigned long, int,
    Format_description_log_event const*)

Fix:
===
**Part6: Moved the event_len validation to the begin of copy_log_event function**
2020-01-07 18:27:05 +05:30
Sujatha
15781283eb MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following ASAN error

AddressSanitizer: heap-buffer-overflow on address
String::append(char const*, unsigned int)
Query_log_event::pack_info(Protocol*)

Fix:
===
**Part5: Added check to catch buffer overflow**
2020-01-07 18:27:05 +05:30
Sujatha
a42ef10815 MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following ASAN error

heap-buffer-overflow within "my_strndup" in Rotate_log_event

my_strndup /mysys/my_malloc.c:254
Rotate_log_event::Rotate_log_event(char const*, unsigned int,
    Format_description_log_event const*)

Fix:
===
**Part4: Improved the check for event_len validation**
2020-01-07 18:27:05 +05:30
Sujatha
5a54e84e5d MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following crash when ASAN is enabled.

SEGV on unknown address
in inline_mysql_mutex_destroy
in my_bitmap_free
in Update_rows_log_event::~Update_rows_log_event()

Fix:
===
**Part3: Initialize m_cols_ai.bitmap to NULL**
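
The Part3 pattern in isolation (a standalone sketch; MY_BITMAP,
my_bitmap_free and the event class are only mimicked here): a null-initialized
bitmap pointer keeps the destructor's free safe even when construction bails
out early.

  #include <cstdint>
  #include <cstdlib>
  #include <cstddef>

  struct bitmap_sketch                      // stands in for MY_BITMAP
  {
    uint32_t *bitmap;
    uint32_t  n_bits;
  };

  static void bitmap_free_sketch(bitmap_sketch *map)  // stands in for my_bitmap_free
  {
    if (map->bitmap)
    {
      std::free(map->bitmap);
      map->bitmap= nullptr;
    }
  }

  class update_rows_event_sketch
  {
  public:
    update_rows_event_sketch(const char *buf, std::size_t len)
    {
      m_cols_ai.bitmap= nullptr;            // the fix: defined state before any early return
      m_cols_ai.n_bits= 0;
      if (buf == nullptr || len == 0)
        return;                             // constructor may bail out before allocating
      // ... decoding of the after-image column bitmap would go here ...
    }
    ~update_rows_event_sketch() { bitmap_free_sketch(&m_cols_ai); }

  private:
    bitmap_sketch m_cols_ai;
  };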
2020-01-07 18:27:05 +05:30
Sujatha
913f405d95 MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports the following assert when ASAN is enabled.

Rows_log_event::Rows_log_event(const char*, uint,
    const Format_description_log_event*):
Assertion `var_header_len >= 2'

Implemented part of upstream patch.
commit: mysql/mysql-server@a3a497ccf7

Fix:
===
**Part2: Avoid reading out of buffer limits**
2020-01-07 18:27:05 +05:30
Sachin Setiya
27664ef29d MDEV-20574 Position of events reported by mysqlbinlog is wrong with encrypted binlogs, SHOW BINLOG EVENTS reports the correct one.
Analysis

Mysqlbinlog output for encrypted binary log
#Q> insert into tab1 values (3,'row 003')
#190912 17:36:35 server id 10221  end_log_pos 980 CRC32 0x53bcb3d3  Table_map: `test`.`tab1` mapped to number 19
# at 940
#190912 17:36:35 server id 10221  end_log_pos 1026 CRC32 0xf2ae5136     Write_rows: table id 19 flags: STMT_END_F

Here we can see that Table_map_log_event ends at 980 but the next event starts
at 940. The reason for this is that we do not send the START_ENCRYPTION_EVENT
to the slave.

Solution:
Send the Start_encryption_log_event as an Ignorable_log_event to the slave
(mysqlbinlog), so that mysqlbinlog can update its log_pos.
Since the slave can request multiple FORMAT_DESCRIPTION_EVENTs while the master
does not actually have them, we only update the slave's master position when
the master actually has the FORMAT_DESCRIPTION_EVENT. The same logic should be
applied for the START_ENCRYPTION_EVENT.

Also added a test case where a new server reads data from an old server which
does not send the START_ENCRYPTION_EVENT to the slave.

Master/slave upgrade scenario:
When the slave is updated first, the slave will have the extra logic for
handling START_ENCRYPTION_EVENT, but the master will not be sending
START_ENCRYPTION_EVENT, so there will be no issue.
When the master is updated first, it will send START_ENCRYPTION_EVENT to the
slave, but the slave will ignore this event in queue_event.
2019-10-08 14:35:34 +05:30
Marko Mäkelä
6962855185 Merge 10.1 into 10.2 2019-07-18 13:10:09 +03:00
Sujatha
10ebdb7f1d MDEV-11154: Write_on_release_cache(log_event.cc) function will not write "COMMIT", if use "mysqlbinlog ... | mysql ..."
Problem:
=======
While executing the command "mysqlbinlog --read-from-remote-server --host='xx.xx.xx.xx'
--port=3306 --user=xxx --password=xxx --database=mysql --to-last-log
mysql-bin.000001 --start-position=1098699 --stop-never | mysql -uxxx -pxxx", we
found that the last data read from the remote server could not be committed.

Analysis:
========
The purpose of 'Write_on_release_cache' is that the contents of the cache will
automatically be written to a dedicated result file on destruction. Flushing of
the result file is controlled by the flag 'FLUSH_F'. Events which require a
forced flush upon destruction have to enable 'Write_on_release_cache::FLUSH_F'.
At present the 'FLUSH_F' flag is defined in an enum as shown below.

enum flag
{
  FLUSH_F
};

Since 'FLUSH_F' is the first enumerator and has no explicit initializer, it gets
the default value '0'. Because of this the following flush condition never
succeeds:

if (m_flags & FLUSH_F)
  fflush(m_file);

At present the file gets flushed only during the my_fclose(result_file)
operation. When continuous streaming is enabled through the --stop-never
option, it never gets flushed and hence events are not replicated.

Fix:
===
Initialize the enum value to a non-zero value.
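
In code, the whole fix is the initializer (the value 1 below is just one
possible non-zero choice):

  enum flag
  {
    FLUSH_F= 1    // was implicitly 0, so (m_flags & FLUSH_F) could never be true
  };

  // The existing destructor-time check now actually fires:
  //   if (m_flags & FLUSH_F)
  //     fflush(m_file);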
2019-07-15 13:30:10 +05:30
Sujatha
5a2110e7cf MDEV-19076: rpl_parallel_temptable result mismatch '-33 optimistic'
Problem:
========
The test now fails with the following trace:

CURRENT_TEST: rpl.rpl_parallel_temptable
--- /mariadb/10.4/mysql-test/suite/rpl/r/rpl_parallel_temptable.result
+++ /mariadb/10.4/mysql-test/suite/rpl/r/rpl_parallel_temptable.reject
@@ -194,7 +194,6 @@
 30    conservative
 31    conservative
 32    optimistic
-33    optimistic

Analysis:
=========
The part of test which fails with result content mismatch is given below.

CREATE TEMPORARY TABLE t4 (a INT PRIMARY KEY) ENGINE=InnoDB;
INSERT INTO t4 VALUES (32);
INSERT INTO t4 VALUES (33);
INSERT INTO t1 SELECT a, "optimistic" FROM t4;

slave_parallel_mode=optimistic

The expectation of the above test script is that the INSERT ... SELECT should
read both 32 and 33 and populate table 't1'. But this expectation occasionally
fails.

All three INSERT statements are handed over to three different slave parallel
workers. Temporary tables are not safe for parallel replication. They were
designed to be visible to one thread only, so they have no table locking. Thus
there is no protection against things like two conflicting transactions
committing in parallel.

So, under parallel replication, anything that uses temporary tables will be
serialized with anything before it by means of a "wait_for_prior_commit"
function call. This ensures that each such transaction is executed sequentially.

But there exists a code path in which the above wait doesn't happen. Because of
this, at times the INSERT ... SELECT doesn't wait for the INSERT (33) to
complete; it finishes its execution and enters the commit stage. Hence only row
32 is found in those cases, resulting in the test failure.

The wait needs to be added within the "open_temporary_table" call. The code
within "open_temporary_table" looks like this:

Each thread tries to open a temporary table in 3 different ways:

case 1: Find a temporary table which is already in use by using
         find_temporary_table(tl) && wait_for_prior_commit()
case 2: If the above fails, look for a temporary table which is marked as free
        for reuse. This internally calls "wait_for_prior_commit()" if a table
        is found.
         find_and_use_tmp_table(tl, &table)
case 3: If none of the above succeeds, open a new table handle from the table share.
         if (!table && (share= find_tmp_table_share(tl)))
         { table= open_temporary_table(share, tl->get_table_name(), true); }

At present the "wait_for_prior_commit" happens only in cases 1 & 2.

Fix:
====
On the slave, add a call to "wait_for_prior_commit" for case 3.

The above wait on the slave will solve the issue. A more thorough fix would be
to mark temporary tables as not safe for parallel execution on the master side.
In order to do that, on the master side, mark the Gtid_log_event specific flag
FL_TRANSACTIONAL as false all the time, so that these transactions are not
scheduled in parallel.
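
A sketch of where the new wait lands, based on the case-3 snippet quoted above
(error handling and the slave-only condition are omitted here):

  if (!table && (share= find_tmp_table_share(tl)))
  {
    wait_for_prior_commit();  // added: serialize with prior transactions, as in cases 1 & 2
    table= open_temporary_table(share, tl->get_table_name(), true);
  }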
2019-05-20 15:46:26 +05:30
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru
5543b75550 Update FSF Address
* Update wrong zip-code
2019-05-11 21:29:06 +03:00
Oleksandr Byelkin
8cbb14ef5d Merge branch '10.1' into 10.2 2019-05-04 17:04:55 +02:00
Sergei Golubchik
2ce52790ff Merge branch '5.5' into 10.1 2019-04-26 14:02:37 +02:00
Venkatesh Venugopal
ae1b8b9bf5 Problem
-------
MySQL abnormally exits on a KILL command.

Fix
---
The abnormal exit has been fixed.

RB: 20971, 21129, 21237
2019-04-25 18:03:00 +02:00
Oleksandr Byelkin
4e01bc8c96 MDEV-16240: Assertion `0' failed in row_sel_convert_mysql_key_to_innobase
Set table in row ID position mode before using this function.
2019-04-25 18:02:31 +02:00
Marko Mäkelä
9835f7b80f Merge 10.1 into 10.2 2019-03-04 16:46:58 +02:00
Alexander Barkov
19df45a705 MDEV-18333 Slow_queries count doesn't increase when slow_query_log is turned off 2019-03-04 13:49:15 +04:00
Marko Mäkelä
081fd8bfa2 Merge 10.1 into 10.2 2019-02-02 11:40:02 +02:00
Oleksandr Byelkin
a3a4ea9355 postmerge rollbacks and fixes 2019-01-31 19:28:38 +01:00
Oleksandr Byelkin
560799ebd8 Merge branch '10.0-galera' into 10.1 2019-01-31 09:34:34 +01:00
Andrei Elkin
ef0b91ea94 MDEV-17803: ulonglongization of table_mapping entry::table_id to fix windows compilation in particular. 2019-01-25 13:42:27 +02:00
Andrei Elkin
5d48ea7d07 MDEV-10963 Fragmented BINLOG query
The problem was originally stated in
  http://bugs.mysql.com/bug.php?id=82212
The size of a base64-encoded Rows_log_event exceeds its
vanilla byte representation by about 4/3 times.
When a binlogged event's size is about 1GB, mysqlbinlog generates
a BINLOG query that can't be sent out due to its size.

It is fixed by fragmenting the BINLOG argument C-string into
(approximate) halves when the base64-encoded event is over 1GB in size.
In such a case mysqlbinlog puts out

    SET @binlog_fragment_0='base64-encoded-fragment_0';
    SET @binlog_fragment_1='base64-encoded-fragment_1';
    BINLOG @binlog_fragment_0, @binlog_fragment_1;

to represent a big BINLOG statement.
For prompt memory release, the BINLOG handler is made to reset the BINLOG
argument user variables in the middle of processing, as if
@binlog_fragment_{0,1} = NULL were assigned.

Notice that the 2 fragments are enough, though the client and server may still
need to tweak their @@max_allowed_packet to accommodate the fragment
size (which they would have to do anyway with a greater number of
fragments, should that be desired).
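
A back-of-the-envelope sketch of the sizes involved (standalone, illustrative
numbers only):

  #include <cstdint>
  #include <cstdio>

  int main()
  {
    const uint64_t event_size  = 1ULL << 30;                // ~1GB binlog event
    const uint64_t base64_size = (event_size + 2) / 3 * 4;  // ~4/3 expansion
    const uint64_t fragment    = (base64_size + 1) / 2;     // two roughly equal halves
    std::printf("encoded=%llu bytes, per fragment=%llu bytes\n",
                (unsigned long long) base64_size,
                (unsigned long long) fragment);
    // Each SET @binlog_fragment_N='...' statement must still fit within
    // @@max_allowed_packet on both the client and the server.
    return 0;
  }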

On the lower level the following changes are made:

Log_event::print_base64()
  still calls the encoder and stores the encoded data into a cache, but
  now *without* doing any formatting. The latter is left for the time
  when the cache is copied to an output file (e.g. the mysqlbinlog output).
  The no-formatting behavior is also reflected by the change in the meaning
  of the last argument, which specifies whether to cache the encoded data.

Rows_log_event::print_helper()
  is made to invoke a specialized fragmented cache-to-file copying function,
  which is

copy_cache_to_file_wrapped()
  that takes care of the fragmenting and also optionally wraps the encoded
  strings (fragments) into SQL stanzas.

my_b_copy_to_file()
  is refactored into my_b_copy_all_to_file(). The former function
  is generalized to accept a limit argument to constrain the copying and
  no longer reinitializes the cache into reading mode.
  The limit has no effect on a fully read cache.
2019-01-24 20:44:50 +02:00
Marko Mäkelä
fab531a150 Fix the build after MDEV-17803
Use the same data type 'ulong' to avoid type mismatch on Windows
and on 32-bit systems.

FIXME: The correct data type should probably be 64-bit.
2019-01-24 15:59:00 +02:00
Marko Mäkelä
32062cc61c Merge 10.1 into 10.2 2018-11-06 08:41:48 +02:00
Sergei Golubchik
44f6f44593 Merge branch '10.0' into 10.1 2018-10-30 15:10:01 +01:00
Jan Lindström
b0fe082b36 Merge remote-tracking branch 'origin/5.5-galera' into 10.0-galera 2018-10-30 13:22:52 +02:00
Jan Lindström
2ee9343c87 Merge tag 'mariadb-5.5.62' into 5.5-galera 2018-10-29 18:45:19 +02:00
Sergei Golubchik
37ab7e4596 Merge branch '5.5' into 10.0 2018-10-27 20:46:38 +02:00
Sachin
e31e697f17 MDEV-15919 lower_case_table_names does not behave as expected (nor...
consistently) on Replication Slave

lower_case_table_names 0 -> 1 replication works; it's safe as long as the
mapping of mixed-case names to the lower-case ones is one-to-one
2018-10-17 10:46:20 +05:30
Kristian Nielsen
2f4a0c5be2 Fix accumulation of old rows in mysql.gtid_slave_pos
This would happen especially in optimistic parallel replication, where there
is a good chance that a transaction will be rolled back (due to conflicts)
after it has executed record_gtid(). If the transaction did any deletions of
old rows as part of record_gtid(), those deletions will be undone as well.
And the code did not properly ensure that the deletions would be re-tried.

This patch makes record_gtid() remember the list of deletions done as part
of a transaction. Then in rpl_slave_state::update() when the changes have
been committed, we discard the list. However, in case of error and rollback,
in cleanup_context() we will instead put the list back into
rpl_global_gtid_slave_state so that the deletions will be re-tried later.
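
The bookkeeping pattern, reduced to a standalone sketch (rpl_slave_state's real
members and types are not reproduced; std::list stands in for the patch's list
of pending deletions):

  #include <cstdint>
  #include <list>

  struct gtid_delete_tracker
  {
    std::list<uint64_t> pending;             // rows deleted by record_gtid(), not yet durable

    void record_gtid_delete(uint64_t row_id)
    { pending.push_back(row_id); }           // remember what this transaction deleted

    void on_commit()                         // rpl_slave_state::update() path
    { pending.clear(); }                     // deletions are durable, forget them

    void on_rollback(std::list<uint64_t> &global_retry)  // cleanup_context() path
    {
      // the row deletions were rolled back with the transaction;
      // hand them back so they are re-tried later
      global_retry.splice(global_retry.end(), pending);
    }
  };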

Probably fixes part of the cause of MDEV-12147 as well.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2018-10-07 18:59:52 +02:00
Oleksandr Byelkin
28f08d3753 Merge branch '10.1' into 10.2 2018-09-14 08:47:22 +02:00
Sergei Golubchik
db947b7599 Merge branch '10.0-galera' into 10.1 2018-09-07 15:25:27 +02:00
Jan Lindström
c5a8583b31 Merge tag 'mariadb-10.0.36' into 10.0-galera 2018-08-02 11:44:02 +03:00
Jan Lindström
c863159c32 MDEV-16799: Test wsrep.variables crash at sql_class.cc:639 thd_get_ha_data
The problem was that binlog_hton was not fully initialized when needed,
i.e. when wsrep_on = true.
2018-07-24 14:54:50 +03:00
Marko Mäkelä
31c950cca8 Merge 10.1 into 10.2 2018-06-26 18:16:49 +03:00
Marko Mäkelä
c6392d52ee Merge 10.0 into 10.1 2018-06-26 17:34:44 +03:00
Andrei Elkin
28e1f1453f MDEV-15242 Poor RBR update performance with partitioned tables
The observed and described execution time difference of the partitioned
engine between master and slave was caused by excessive invocation of
base_engine::rnd_init, which was also done for partitions not involved in
the Rows-event operation.
The bug's slave slowdown therefore scales with the number of partitions.
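
A generic sketch of the upstream fix's idea (partition handler internals are
not reproduced; the read set is shown as a plain vector of flags): scans are
initialized only for the partitions the Rows event actually touches.

  #include <cstddef>
  #include <vector>

  struct partition_sketch
  {
    bool scan_started= false;
    void rnd_init() { scan_started= true; }  // stands in for base_engine::rnd_init
  };

  void init_involved_partitions(std::vector<partition_sketch> &parts,
                                const std::vector<bool> &read_set)
  {
    // Before the fix every partition paid the rnd_init() cost for every Rows
    // event; afterwards only the partitions marked in the read set do.
    for (std::size_t i= 0; i < parts.size(); i++)
      if (read_set[i])
        parts[i].rnd_init();
  }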

Fixed by applying an upstream patch.

References:
----------
https://bugs.mysql.com/bug.php?id=73648
Bug#25687813 REPLICATION REGRESSION WITH RBR AND PARTITIONED TABLES
2018-06-25 16:45:00 +03:00
Sergei Golubchik
b942aa34c1 Merge branch '10.1' into 10.2 2018-06-21 23:47:39 +02:00