Commit graph

2668 commits

Author SHA1 Message Date
Vicențiu Ciorbaru
f177f125d4 Merge branch '5.5' into 10.1 2019-05-11 19:15:57 +03:00
Vicențiu Ciorbaru
15f1e03d46 Follow-up to changing FSF address
Some places didn't match the previous rules, making the Floor
address wrong.

Additional sed rules:

sed -i -e 's/Place.*Suite .*, Boston/Street, Fifth Floor, Boston/g'
sed -i -e 's/Suite .*, Boston/Fifth Floor, Boston/g'
2019-05-11 18:30:45 +03:00
Sergei Golubchik
ffb83ba650 cleanup: move checksum code to handler class
make live checksum to be returned in handler::info(),
and slow table-scan checksum to be calculated in handler::checksum().

part of
MDEV-16249 CHECKSUM TABLE for a spider table is not parallel and saves all data in memory in the spider head by default
2019-05-07 18:40:36 +02:00
Marko Mäkelä
b132b8895e Merge 10.3 into 10.4 2019-05-05 10:23:14 +03:00
Sachin
27980b0f83 MDEV-19365 Assertion failure in LONG Unique after 10.3 merge
If handler->inited is RND use cloned handler for long unique duplicate search.
2019-05-05 10:19:22 +03:00
Marko Mäkelä
4d59f45260 Merge 10.2 into 10.3 2019-04-27 20:41:31 +03:00
Oleksandr Byelkin
4e01bc8c96 MDEV-16240: Assertion `0' failed in row_sel_convert_mysql_key_to_innobase
Set table in row ID position mode before using this function.
2019-04-25 18:02:31 +02:00
Marko Mäkelä
e6bdf77e4b Merge 10.3 into 10.4
In is_eits_usable(), we disable an assertion that fails due to
MDEV-19334.
2019-04-25 16:05:20 +03:00
Sergey Vojtovich
b7fd7ce286 Moved normal transaction xid to implicit_xid
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00
Sergey Vojtovich
a168cfb396 Move XID_state::xa_state handing inside xa.cc
Let xid_cache_insert()/xid_cache_delete() handle xa_state.

Let session tracker use is_explicit_XA() rather than xa_state != XA_NOTR.

Fixed open_tables() to refuse data access in XA_ROLLBACK_ONLY state.

Removed dead code from THD::cleanup(). It was supposed to be a reminder,
but it got messed up over time.

spider_internal_start_trx() is called either with XA_NOTR or XA_ACTIVE,
which is guarded by server callers. Thus is_explicit_XA() is acceptable
replacement for XA_ACTIVE check (which was likely wrong anyway).

Setting xa_state to XA_PREPARED in spider_internal_xa_prepare() isn't
meaningful, as this value is never accessed later. It can't be accessed
by current thread and it can't be recovered either. It can only be
accessed by spider internally, which never happens.

Make spider_xa_lock()/spider_xa_unlock() static.

Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00
Sergey Vojtovich
f189f34ed4 Move XID_STATE::rm_error to XID_cache_element
XID_STATE::rm_error is never used by internal 2PC, it is intended to be
used by explicit XA only.

Also removed redundant xid reset from THD::init_for_queries(). Must've
been done already either by THD::transaction constructor or by
THD::cleanup().

Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00
Sergey Vojtovich
07140f171d Just move, no code changes otherwise.
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00
Marko Mäkelä
acf6f92aa9 Merge 10.2 into 10.3 2019-04-25 09:05:52 +03:00
Marko Mäkelä
bc145193c1 Merge 10.1 into 10.2 2019-04-25 09:04:09 +03:00
Marko Mäkelä
bfb0726fc2 Merge 5.5 into 10.1 2019-04-24 12:03:11 +03:00
Thirunarayanan Balathandayuthapani
d5da8ae04d MDEV-15772 Potential list overrun during XA recovery
InnoDB could return the same list again and again if the buffer
passed to trx_recover_for_mysql() is smaller than the number of
transactions that InnoDB recovered in XA PREPARE state.

We introduce the transaction state TRX_PREPARED_RECOVERED, which
is like TRX_PREPARED, but will be set during trx_recover_for_mysql()
so that each transaction will only be returned once.

Because init_server_components() is invoking ha_recover() twice,
we must reset the state of the transactions back to TRX_PREPARED
after returning the complete list, so that repeated traversals
will see the complete list again, instead of seeing an empty list.
Without this tweak, the test main.tc_heuristic_recover would hang
in MariaDB 10.1.
2019-04-24 11:46:14 +03:00
Eugene Kosov
3a3d5ba235 MDEV-13301 Optimize DROP INDEX, ADD INDEX into RENAME INDEX
Just rename index in data dictionary and in InnoDB cache when it's possible.
Introduce ALTER_INDEX_RENAME for that purpose so that engines can optimize
such operation.

Unused code between macro MYSQL_RENAME_INDEX was removed.

compare_keys_but_name(): compare index definitions except for index names

Alter_inplace_info::rename_keys:
ha_innobase_inplace_ctx::rename_keys: vector of rename indexes

fill_alter_inplace_info():: fills Alter_inplace_info::rename_keys
2019-04-03 18:36:33 +02:00
Sachin
ba7d33a898 MDEV-18820 Assertion `lock_table_has(trx, index->table, LOCK_IX)' failed in lock_rec_insert_check_and_lock upon INSERT into table with blob key
Don't Ignore Any error during index lookup, And throw duplicate key error
only if error is HA_ERR_FOUND_DUPP_KEY
2019-04-03 12:53:58 +05:30
Marko Mäkelä
5c3ff5cb93 Merge 10.3 into 10.4 2019-04-02 11:04:54 +03:00
Nikita Malyavin
e6230e844c MDEV-15951 system versioning by trx id doesn't work with partitioning
Fix partitioning for trx_id-versioned tables.
`partition by hash`, `range` and others now work.
`partition by system_time` is forbidden.
Currently we cannot use row_start and row_end in `partition by`, because
insertion of versioned field is done by engine's handler, as well as
row_start/row_end's value set up, which is a transaction id -- so it's
also forbidden.

The drawback is that it's now impossible to use `partition by key()`
without parameters for such tables, because it references row_start and
row_end implicitly.

* add handler::vers_can_native()
* drop Table_scope_and_contents_source_st::vers_native()
* drop partition_element::find_engine_flag as unused
* forbid versioning partitioning for trx_id as not supported
* adopt vers tests for trx_id partitioning
* forbid any row_end referencing in `partition by` clauses,
  including implicit `by key()`
2019-03-29 12:51:19 +01:00
Marko Mäkelä
514b305dfb Merge 10.3 into 10.4
The MDEV-17262 commit 26432e49d3
was skipped. In Galera 4, the implementation would seem to require
changes to the streaming replication.

In the tests archive.rnd_pos main.profiling, disable_ps_protocol
for SHOW STATUS and SHOW PROFILE commands until MDEV-18974
has been fixed.
2019-03-20 10:41:32 +02:00
Sergei Golubchik
f38c352172 post-merge: gcc 8 warnings 2019-03-17 13:06:56 +01:00
sachinsetia1001@gmail.com
8995f33c0b MDEV-18889 Long unique on virtual fields crashes server
Use table->record[0] for ha_index_read_map so that vfield gets automatically
be updated.
2019-03-17 13:45:57 +05:30
Marko Mäkelä
2a791c53ad Merge 10.3 into 10.4 2019-03-06 09:00:52 +02:00
Julius Goryavsky
723ffdb32e MDEV-9519: After-merge fix for 10.3 2019-03-05 14:11:42 +01:00
Marko Mäkelä
a2fc36989e Merge 10.2 into 10.3 2019-03-04 17:01:00 +02:00
Teemu Ollakka
a8ff4f243d MDEV-18631 Fix streaming replication with wsrep_gtid_mode=ON
With wsrep_gtid_mode=ON, the appropriate commit hooks were not
called in all cases for applied streaming transactions.

As a fix, removed all special handling of commit order critical
section from Wsrep_high_priority_service and Wsrep_storage_service.
Now commit order critical section is always entered in ha_commit_trans().

Check for wsrep_run_commit_hook is now done in handler.cc, log.cc.
This makes it explicit that the transaction is an active wsrep
transaction which must go through commit hooks.
2019-03-04 09:00:25 +02:00
Jan Lindström
f65f40bb35 Merge remote-tracking branch 'origin/10.1' into 10.2 2019-02-28 13:08:11 +02:00
Jan Lindström
5a87e3ee87 Revert offending part of MDEV-9519: Data corruption will happen on the Galera cluster size change
This will allow test binlog.binlog_stm_binlog to pass more often.
Note that this is not a real fix to that test failure.
2019-02-28 09:29:19 +02:00
Sergei Golubchik
b10340998f MDEV-18748 REPLACE doesn't work with unique blobs on MyISAM table
on long unique conflict, set table->file->dup_ref for
engines that support it
2019-02-27 23:27:43 -05:00
Sergei Golubchik
304f0084ef MDEV-18720 Assertion `inited==NONE' failed in ha_index_init upon update on versioned table with key on blob
* update system versioning fields before generaled columns
* don't presume that ha_write_row() means INSERT. It could still be UPDATE
* use the correct handler in check_duplicate_long_entry_key()
2019-02-27 23:27:43 -05:00
Sergei Golubchik
387b690eab cleanup: cosmetic fixes 2019-02-27 23:15:28 -05:00
Julius Goryavsky
50b3632fa4 MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-26 08:09:04 +02:00
Julius Goryavsky
2c734c980e MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-26 07:45:11 +02:00
Julius Goryavsky
243f829c1c MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-25 11:19:07 +02:00
Oleksandr Byelkin
fb01193ce9 make windows compiler happy 2019-02-22 17:03:14 +01:00
sachin
0372f98cb5 Fix buildbot Windows and bintar compile failure 2019-02-22 12:33:08 +01:00
Sachin
d00f19e832 MDEV-371 Unique Index for long columns
This patch implements engine independent unique hash index.

Usage:- Unique HASH index can be created automatically for blob/varchar/test column whose key
 length > handler->max_key_length()
or it can be explicitly specified.

  Automatic Creation:-
   Create TABLE t1 (a blob unique);
  Explicit Creation:-
   Create TABLE t1 (a int , unique(a) using HASH);

Internal KEY_PART Representations:-
 Long unique key_info will have 2 representations.
 (lets understand this with an example create table t1(a blob, b blob , unique(a, b)); )

 1. User Given Representation:- key_info->key_part array will be similar to what user has defined.
 So in case of example it will have 2 key_parts (a, b)

 2. Storage Engine Representation:- In this case there will be only one key_part and it will point to
 HASH_FIELD. This key_part will be always after user defined key_parts.

 So:- User Given Representation          [a] [b] [hash_key_part]
                  key_info->key_part ----^
  Storage Engine Representation          [a] [b] [hash_key_part]
                  key_info->key_part ------------^

 Table->s->key_info will have User Given Representation, While table->key_info will have Storage Engine
 Representation.Representation can be changed into each other by calling re/setup_keyinfo_hash function.

Working:-

1. So when user specifies HASH_INDEX or key_length is > handler->max_key_length(), In mysql_prepare_create_table
One extra vfield is added (for each long unique key). And key_info->algorithm is set to HA_KEY_ALG_LONG_HASH.

2. In init_from_binary_frm_image values for hash_keypart is set (like fieldnr , field and flags)

3. In parse_vcol_defs, HASH_FIELD->vcol_info is created. Item_func_hash is used with list of Item_fields,
   When Explicit length is given by user then Item_left is used to concatenate Item_field values.

4. In ha_write_row/ha_update_row check_duplicate_long_entry_key is called which will create the hash key from
table->record[0] and then call ha_index_read_map , if we found duplicated hash , we will compare the result
field by field.
2019-02-22 00:35:40 +01:00
Sergei Golubchik
8ad23ff498 don't allow TIME columns in PERIOD specification 2019-02-21 14:57:10 +01:00
Nikita Malyavin
6294516a56 MDEV-16975 Application-time periods: ALTER TABLE
* implicit period constraint is hidden and cannot be dropped independently
* create...like and create...select support
2019-02-21 14:57:09 +01:00
Nikita Malyavin
073c93b194 MDEV-17082 Application-time periods: CREATE
* add syntax `CREATE TABLE ... PERIOD FOR <apptime>`
* add table period entity
2019-02-21 14:48:04 +01:00
Teemu Ollakka
f0b65102b2 MDEV-18585 Avoid excessive Annotate_rows_log_events in binlog
Make sure that the Annotate_rows_log_events is written into
binlog only for the first fragment of the current statement.
Also avoid flusing pending rows event when calculating bytes
generated by the transaction.

Added and recorded a test which verifies that the binlog
contains only one Annotate_rows_log_event per statement
with various SR settings. Recrded mysql-wsrep-features#136
which produced different output with excession log events
suppressed.
2019-02-18 15:04:38 +02:00
Igor Babaev
98d55b1366 Merge branch '10.4' into bb-10.4-mdev16188 2019-02-14 22:07:33 -08:00
Igor Babaev
9e114455a9 MDEV-16188 Post merge fixes:fixed warnings on Windows
Also adjusted some result files after Galina's last patch for ANALYZE.
2019-02-06 15:56:21 -08:00
Marko Mäkelä
e80bcd7f64 Merge 10.3 into 10.4 2019-02-05 12:48:02 +02:00
Marko Mäkelä
ab2458c61f Merge 10.2 into 10.3 2019-02-04 15:12:14 +02:00
Igor Babaev
37deed3f37 Merge branch '10.4' into bb-10.4-mdev16188 2019-02-03 18:41:18 -08:00
Igor Babaev
658128af43 MDEV-16188 Use in-memory PK filters built from range index scans
This patch contains a full implementation of the optimization
that allows to use in-memory rowid / primary filters built for range  
conditions over indexes. In many cases usage of such filters reduce  
the number of disk seeks spent for fetching table rows.

In this implementation the choice of what possible filter to be applied  
(if any) is made purely on cost-based considerations.

This implementation re-achitectured the partial implementation of
the feature pushed by Galina Shalygina in the commit
8d5a11122c.

Besides this patch contains a better implementation of the generic  
handler function handler::multi_range_read_info_const() that
takes into account gaps between ranges when calculating the cost of
range index scans. It also contains some corrections of the
implementation of the handler function records_in_range() for MyISAM.

This patch supports the feature for InnoDB and MyISAM.
2019-02-03 14:56:12 -08:00
Teemu Ollakka
4ea128391b MDEV-15740 Fix wsrep recovery with wsrep_emulate_bin_log
If the TC log did not provide list of XIDs to recover, the
commit by XID was skipped during wsrep recovery if binlog emulation
was on. However, with wsrep we want to commit every prepared transaction
with assigned wsrep XID since the transaction has already been
committed in the cluster.

Added a special condition to always proceed to commit by XID in
xarecover_handlerton() if binlog is off and the recovered transaction
has wsrep XID.
2019-01-27 15:36:36 +02:00
Teemu Ollakka
040b840de7 MDEV-15740 Backport wsrep recovery fixes from 10.4.
Clear wsrep XID in innobase_rollback_by_xid() for recovered wsrep
transaction in order to avoid resetting XID storage when rolling back
wsrep transaction during recovery.

Sort wsrep XIDs read from storage engine in ascending order and
erify that the range is continuous during crash recovery. If binlog is off,
commit all recovered transactions for continuous seqno range. This is safe
because all transactions with wsrep XID have been certified and must be
committed in the cluster. On the other hand if binlog is on, respect binlog
as a transaction coordinator in order to avoid missing transactions in binlog
that have been committed into storage engine .
2019-01-25 16:19:20 +02:00