Commit graph

1242 commits

Author SHA1 Message Date
Marko Mäkelä
e9aaa10c11 Merge 10.2 into 10.3 2020-05-29 22:21:19 +03:00
Sergei Golubchik
e64dc07125 assert(a && b); -> assert(a); assert(b); 2020-05-27 15:56:40 +02:00
Thirunarayanan Balathandayuthapani
f8ec3ba01b MDEV-21832 FORCE all partition to rebuild if any one of the
partition does rebuild

- In ha_innobase::commit_inplace_alter_table() assumes that all partition
should do the same kind of alter operations. During DDL, if one partition
requires table rebuild and other partition doesn't need rebuild
then all partition should be forced to rebuild.
2020-03-30 12:41:59 +03:00
Igor Babaev
cfa0506f8a MDEV-21554 Crash in JOIN_CACHE_BKAH::skip_index_tuple when mrr=on and
join_cache_level=6+

The patch fixes two similar bugs in the commit 8eeb689e9f
that added multi_range_read support to partitions. The commit opened
a possibility to join a partition table using BKA+MRR. However in some
cases it could lead to wrong results or even crashes.

This could happened when
- index condition pushdown was used to join the table or
- the joined table was an inner table of an outer join and 'not exist'
  optimization was applied or
- the join table was the inner table of a semi-join and the first match
  optimization was applied

The bugs were in the code of the call-back functions
- partition_multi_range_key_skip_record() and
- partition_multi_range_key_skip_index_tuple().
Each of this function consist only of an invocation of another function.
Yet a wrong parameter was passed at this invocation.

The fix was suggested by Sergey Petrunia and it is apparently in line
with original design.
The corresponding comprehensive test cases demonstrating the problems
caused by the bugs were constructed by me.
2020-02-25 00:50:23 -08:00
Sergei Petrunia
b3ded21922 ha_partition: add comments, comment out unused member variables 2020-02-05 00:54:16 +03:00
Aleksey Midenkov
d759f764f6 MDEV-21233 Assertion `m_extra_cache' failed in ha_partition::late_extra_cache
Incorrect assertion of EXTRA_CACHE for
HA_EXTRA_PREPARE_FOR_UPDATE. The latter is related to read cache, but
must operate without it as a noop.

Related to Bug#55458 and MDEV-20441.
2019-12-05 15:25:30 +03:00
Faustin Lammler
2df2238cb8 Lintian complains on spelling error
The lintian check complains on spelling error:
https://salsa.debian.org/mariadb-team/mariadb-10.3/-/jobs/95739
2019-12-02 12:41:13 +02:00
Aleksey Midenkov
a7cf0db3d8 MDEV-21011 Table corruption reported for versioned partitioned table after DELETE: "Found a misplaced row"
LIMIT history partitions cannot be checked by existing algorithm of
check_misplaced_rows() because working history partition is
incremented each time another one is filled. The existing algorithm
gets record and tries to decide partition id for it by
get_partition_id(). For LIMIT history it will just get first
non-filled partition.

To fix such partitions it is required to do REBUILD instead of REPAIR.
2019-12-02 11:48:37 +03:00
Aleksey Midenkov
0076dce2c8 MDEV-18727 improve DML operation of System Versioning
MDEV-18957 UPDATE with LIMIT clause is wrong for versioned partitioned tables

UPDATE, DELETE: replace linear search of current/historical records
with vers_setup_conds().

Additional DML cases in view.test
2019-11-22 14:29:03 +03:00
Alexey Botchkov
7b5654f3e9 MDEV-14667 Assertion `used_parts > 0' failed in ha_partition::init_record_priority_queue.
Do not fail fi all the partitions were pruned out.
2019-11-20 00:33:32 +04:00
Sergei Petrunia
86167e908f MDEV-20611: MRR scan over partitioned InnoDB table produces "Out of memory" error
Fix partitioning and DS-MRR to work together

- In ha_partition::index_end(): take into account that ha_innobase (and
  other engines using DS-MRR) will have inited=RND when initialized for
  DS-MRR scan.
- In ha_partition::multi_range_read_next(): if the MRR scan is using
  HA_MRR_NO_ASSOCIATION mode, it is not guaranteed that the partition's
  handler will store anything into *range_info.
- In DsMrr_impl::choose_mrr_impl(): ha_partition will inquire partitions
  about how much memory their MRR implementation needs by passing
  *buffer_size=0. DS-MRR code didn't know about this (actually it used
  uint for buffer size calculation and would have an under-flow).
  Returning *buffer_size=0 made ha_partition assume that partitions do
  not need MRR memory and pass the same buffer to each of them.

  Now, this is fixed. If DS-MRR gets *buffer_size=0, it will return
  the amount of buffer space needed, but not more than about
  @@mrr_buffer_size.

* Fix ha_{innobase,maria,myisam}::clone. If ha_partition uses MRR on its
  partitions, and partition use DS-MRR, the code will call handler->clone
  with TABLE (*NOT partition*) name as an argument.
  DS-MRR has no way of knowing the partition name, so the solution was
  to have the ::clone() function for the affected storage engine to ignore
  the name argument and get it elsewhere.
2019-11-15 23:37:28 +03:00
Alexey Botchkov
6dce6aeceb MDEV-18244 Server crashes in ha_innobase::update_thd / ... / ha_partition::update_next_auto_inc_val.
Partition table with the AUTO_INCREMENT column we ahve to check if the
max value is properly loaded. So we need to open all tables in INSERT
PARTITION statement if necessary. Also we need to check if some
tables are pruned away and not count the max autoincrement in this case.
2019-11-01 09:39:43 +04:00
Monty
938925211a MDEV-19254 Server crashes in maria_status with partitioned table
Bug was that storage_engine::info() was called with not opened table in
ha_partition::info(). Fixed by ensuring that we are using an opened table.
2019-08-19 19:49:45 +03:00
Aleksey Midenkov
a20f6f9853 MDEV-20336 Assertion bitmap_is_set(read_partitions) upon SELECT FOR UPDATE from versioned table
Exclude SELECT and INSERT SELECT from vers_set_hist_part(). We cannot
likewise exclude REPLACE SELECT because it may REPLACE into itself
(and REPLACE generates history).

INSERT also does not generate history, but we have history
modification setting which might be interfered.
2019-08-14 17:32:19 +03:00
Aleksey Midenkov
98758b52b3 MDEV-20068 History partition rotation is not done under LOCK TABLES
Wrong value F_WRLCK for thr_lock_type.
2019-08-11 12:32:08 +03:00
Marko Mäkelä
be85d3e61b Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Oleksandr Byelkin
c51f85f882 Merge branch '10.2' into 10.3 2019-05-12 17:20:23 +02:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru
5543b75550 Update FSF Address
* Update wrong zip-code
2019-05-11 21:29:06 +03:00
Sergei Golubchik
ffb83ba650 cleanup: move checksum code to handler class
make live checksum to be returned in handler::info(),
and slow table-scan checksum to be calculated in handler::checksum().

part of
MDEV-16249 CHECKSUM TABLE for a spider table is not parallel and saves all data in memory in the spider head by default
2019-05-07 18:40:36 +02:00
Oleksandr Byelkin
8cbb14ef5d Merge branch '10.1' into 10.2 2019-05-04 17:04:55 +02:00
Marko Mäkelä
1cd31bc132 Bug#28573894 ALTER PARTITIONED TABLE ADD AUTO_INCREMENT DIFF RESULT DEPENDING ON ALGORITHM
For partitioned table, ensure that the AUTO_INCREMENT values will
be assigned from the same sequence. This is based on the following
change in MySQL 5.6.44:

commit aaba359c13d9200747a609730dafafc3b63cd4d6
Author: Rahul Malik <rahul.m.malik@oracle.com>
Date:   Mon Feb 4 13:31:41 2019 +0530

    Bug#28573894 ALTER PARTITIONED TABLE ADD AUTO_INCREMENT DIFF RESULT DEPENDING ON ALGORITHM

    Problem:
    When a partition table is in-place altered to add an auto-increment column,
    then its values are starting over for each partition.

    Analysis:
    In the case of in-place alter, InnoDB is creating a new sequence object
    for each partition. It is default initialized. So auto-increment columns
    start over for each partition.

    Fix:
    Assign old sequence of the partition to the sequence of next partition
    so it won't start over.

    RB#21148
    Reviewed by Bin Su <bin.x.su@oracle.com>
2019-04-25 14:12:45 +03:00
Sergei Golubchik
f38c352172 post-merge: gcc 8 warnings 2019-03-17 13:06:56 +01:00
Julius Goryavsky
50b3632fa4 MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-26 08:09:04 +02:00
Julius Goryavsky
2c734c980e MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-26 07:45:11 +02:00
Julius Goryavsky
243f829c1c MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-25 11:19:07 +02:00
Nikita Malyavin
6a73569f12 MDEV-16429: Assertion `!table || (!table->read_set || bitmap_is_set(table->read_set, field_index))' fails upon attempt to update virtual column on partitioned versioned table
When using buffered sort in `UPDATE`, keyread is used. In this case,
`TABLE::update_virtual_field` should be aborted, but it actually isn't,
because it is called not with a top-level handler, but with the one that
is actually going to access the disk. Here the problemm is issued with
partitioning, so the solution is to recursively mark for keyread all the
underlying partition handlers.

* ha_partition: update keyread state for child partitions

Closes #800
2018-12-20 08:06:55 +01:00
Aleksey Midenkov
50bc55d462 MDEV-16241 Assertion `inited==RND' failed in handler::ha_rnd_end()
Discrepancy in open indexes due to overwritten `read_partitions` upon
`ha_open()` in `ha_partition::clone()`.

[Fixes tempesta-tech/mariadb#551]
2018-11-13 10:30:27 +01:00
Marko Mäkelä
df563e0c03 Merge 10.2 into 10.3
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.

main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
2018-11-06 09:40:39 +02:00
Marko Mäkelä
32062cc61c Merge 10.1 into 10.2 2018-11-06 08:41:48 +02:00
Sergei Golubchik
44f6f44593 Merge branch '10.0' into 10.1 2018-10-30 15:10:01 +01:00
Sergey Vojtovich
4ac85d6fd7 MDEV-14815 - Server crash or AddressSanitizer errors or valgrind warnings
in thr_lock / has_old_lock upon FLUSH TABLES

Explicit partition access of partitioned MEMORY table under LOCK TABLES
may cause subsequent statements to crash the server, deadlock, trigger
valgrind warnings or ASAN errors. Freed memory was being used due to
incorrect cleanup.

At least MyISAM and InnoDB don't seem to be affected, since their
THR_LOCK structures don't survive FLUSH TABLES. MEMORY keeps table shared
data (including THR_LOCK) even if there're no open instances.

There's partition_info::lock_partitions bitmap, which holds bits of
partitions allowed to be accessed after pruning. This bitmap is
updated for each individual statement.

This bitmap was abused in ha_partition::store_lock() such that when we
need to unlock a table, locked by LOCK TABLES, only locks for partitions
that were accessed by previous statement were released.

Eventually FLUSH TABLES frees THR_LOCK_DATA objects, which are still
linked into THR_LOCK lists. When such THR_LOCK gets reused we end up with
freed memory access.

Fixed by using ha_partition::m_locked_partitions bitmap similarly to
ha_partition::external_lock().
2018-10-19 19:09:48 +04:00
Sergei Golubchik
57e0da50bb Merge branch '10.2' into 10.3 2018-09-28 16:37:06 +02:00
Sergei Golubchik
e7d152293d MDEV-13089 identifier quoting in partitioning
cover ALTER TABLE
2018-09-21 20:22:14 +02:00
Sergei Golubchik
5f654c2e91 comments and dbug keywords 2018-09-21 15:05:55 +02:00
Nikita Malyavin
c16a54c02e MDEV-16429: Assertion `!table || (!table->read_set || bitmap_is_set(table->read_set, field_index))' fails upon attempt to update virtual column on partitioned versioned table
When using buffered sort in `UPDATE`, keyread is used. In this case,
`TABLE::update_virtual_field` should be aborted, but it actually isn't,
because it is called not with a top-level handler, but with the one that
is actually going to access the disk. Here the problemm is issued with
partitioning, so the solution is to recursively mark for keyread all the
underlying partition handlers.

* ha_partition: update keyread state for child partitions

Closes #800
2018-09-21 15:05:54 +02:00
Jacob Mathew
ed49f9aae2 MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3 to 10.2 and 10.1.  In 10.3
and 10.4 I have merged the comments and the test case.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Merged:
  Commit eb2ca3d on branch bb-10.2-MDEV-16912
2018-09-13 14:55:46 -07:00
Jacob Mathew
6c47c1c456 MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Cherry-Picked:
  Commit eb2ca3d on branch bb-10.2-MDEV-16912
2018-09-13 13:27:03 -07:00
Jacob Mathew
eb2ca3d445 MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.
2018-09-11 16:29:44 -07:00
Jacob Mathew
813b739850 MDEV-16246: insert timestamp into spider table from mysqldump gets wrong time zone.
The problem occurred because the Spider node was incorrectly handling
timestamp values sent to and received from the data nodes.

The problem has been corrected as follows:
- Added logic to set and maintain the UTC time zone on the data nodes.
  To prevent timestamp ambiguity, it is necessary for the data nodes to use
  a time zone such as UTC which does not have daylight savings time.
- Removed the spider_sync_time_zone configuration variable, which did not
  solve the problem and which interfered with the solution.
- Added logic to convert to the UTC time zone all timestamp values sent to
  and received from the data nodes.  This is done for both unique and
  non-unique timestamp columns.  It is done for WHERE clauses, applying to
  SELECT, UPDATE and DELETE statements, and for UPDATE columns.
- Disabled Spider's use of direct update when any of the columns to update is
  a timestamp column.  This is necessary to prevent false duplicate key value
  errors.
- Added a new test spider.timestamp to thoroughly test Spider's handling of
  timestamp values.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Cherry-Picked:
  Commit 97cc9d3 on branch bb-10.3-MDEV-16246
2018-07-09 16:09:20 -07:00
Sergei Golubchik
36e59752e7 Merge branch '10.2' into 10.3 2018-06-30 16:39:20 +02:00
Marko Mäkelä
31c950cca8 Merge 10.1 into 10.2 2018-06-26 18:16:49 +03:00
Marko Mäkelä
c6392d52ee Merge 10.0 into 10.1 2018-06-26 17:34:44 +03:00
Andrei Elkin
28e1f1453f MDEV-15242 Poor RBR update performance with partitioned tables
Observed and described
partitioned engine execution time difference
between master and slave was caused by excessive invocation
of base_engine::rnd_init which was done also for partitions
uninvolved into Rows-event operation.
The bug's slave slowdown therefore scales with the number of partitions.

Fixed with applying an upstream patch.

References:
----------
https://bugs.mysql.com/bug.php?id=73648
Bug#25687813 REPLICATION REGRESSION WITH RBR AND PARTITIONED TABLES
2018-06-25 16:45:00 +03:00
Eugene Kosov
aa5683d12e MDEV-15380 Index for versioned table gets corrupt after partitioning and DELETE
In a test case Update occurs between Search and Delete/Update. This corrupts rowid
which Search saves for Delete/Update. Patch prevents this by using of
HA_EXTRA_REMEMBER_POS and HA_EXTRA_RESTORE_POS in a partition code.

This situation possibly occurs only with system versioning table and partition.
MyISAM and Aria engines are affected.

fix by midenok
Closes #705
2018-05-22 13:11:14 +02:00
Sergei Golubchik
b1a6d2826a cleanup: versioning style fixes
rename LString/XString classes, remove unused ones
2018-05-12 10:16:45 +02:00
Sergei Golubchik
0f956a0676 cleanup: hide HA_ERR_RECORD_DELETED in ha_rnd_next()
it's internal storage engine error, don't let it leak
into the upper layer.
2018-05-12 10:16:45 +02:00
Sergei Golubchik
c9717dc019 Merge branch '10.2' into 10.3 2018-05-11 13:15:10 +02:00
Sergei Golubchik
88a0bb83df MDEV-15626 Assertion on update virtual column in partitioned table
table.cc:
  virtual columns must be computed for INSERT, if they're part
  of the partitioning expression.

this change broke gcol.gcol_partition_innodb.
fix CHECK TABLE for partitioned tables and vcols.

sql_partition.cc:
  mark prerequisite base columns in full_part_field_set
ha_partition.cc
  initialize vcol_set accordingly
2018-05-10 12:48:23 +02:00