Commit graph

1401 commits

Author SHA1 Message Date
Monty
f36ca142f7 Added page_range to records_in_range() to improve range statistics
Prototype change:
-  virtual ha_rows records_in_range(uint inx, key_range *min_key,
-                                   key_range *max_key)
+  virtual ha_rows records_in_range(uint inx, const key_range *min_key,
+                                   const key_range *max_key,
+                                   page_range *res)

The handler can ignore the page_range parameter. In the case the handler
updates the parameter, the optimizer can deduce the following:
- If previous range's last key is on the same block as next range's first
  key
- If the current key range is in one block
- We can also assume that the first and last block read are cached!
  This can be used for a better calculation of IO seeks when we
  estimate the cost of a range index scan.

The parameter is fully implemented for MyISAM, Aria and InnoDB.
A separate patch will update handler::multi_range_read_info_const() to
take the benefits of this change and also remove the double
records_in_range() calls that are not anymore needed.
2020-03-27 03:54:45 +02:00
Vladislav Vaintroub
6ef3dbb1ff Fix unused variable warning in optimized build. 2020-03-25 15:53:38 +01:00
Monty
37393bea23 Replace handler::primary_key_is_clustered() with handler::pk_is_clustering_key()
This was done to both simplify the code and also to be easier to handle
storage engines that are clustered on some other index than the primary
key.

As pk_is_clustering_key() and is_clustering_key now are using only
index_flags, these where removed from all storage engines.
2020-03-24 21:00:04 +02:00
Monty
91ab42a823 Clean up and speed up interfaces for binary row logging
MDEV-21605 Clean up and speed up interfaces for binary row logging
MDEV-21617 Bug fix for previous version of this code

The intention is to have as few 'if' as possible in ha_write() and
related functions. This is done by pre-calculating once per statement the
row_logging state for all tables.

Benefits are simpler and faster code both when binary logging is disabled
and when it's enabled.

Changes:
- Added handler->row_logging to make it easy to check it table should be
  row logged. This also made it easier to disabling row logging for system,
  internal and temporary tables.
- The tables row_logging capabilities are checked once per "statements
  that updates tables" in THD::binlog_prepare_for_row_logging() which
  is called when needed from THD::decide_logging_format().
- Removed most usage of tmp_disable_binlog(), reenable_binlog() and
  temporary saving and setting of thd->variables.option_bits.
- Moved checks that can't change during a statement from
  check_table_binlog_row_based() to check_table_binlog_row_based_internal()
- Removed flag row_already_logged (used by sequence engine)
- Moved binlog_log_row() to a handler::
- Moved write_locked_table_maps() to THD::binlog_write_table_maps() as
  most other related binlog functions are in THD.
- Removed binlog_write_table_map() and binlog_log_row_internal() as
  they are now obsolete as 'has_transactions()' is pre-calculated in
  prepare_for_row_logging().
- Remove 'is_transactional' argument from binlog_write_table_map() as this
  can now be read from handler.
- Changed order of 'if's in handler::external_lock() and wsrep_mysqld.h
  to first evaluate fast and likely cases before more complex ones.
- Added error checking in ha_write_row() and related functions if
  binlog_log_row() failed.
- Don't clear check_table_binlog_row_based_result in
  clear_cached_table_binlog_row_based_flag() as it's not needed.
- THD::clear_binlog_table_maps() has been replaced with
  THD::reset_binlog_for_next_statement()
- Added 'MYSQL_OPEN_IGNORE_LOGGING_FORMAT' flag to open_and_lock_tables()
  to avoid calculating of binary log format for internal opens. This flag
  is also used to avoid reading statistics tables for internal tables.
- Added OPTION_BINLOG_LOG_OFF as a simple way to turn of binlog temporary
  for create (instead of using THD::sql_log_bin_off.
- Removed flag THD::sql_log_bin_off (not needed anymore)
- Speed up THD::decide_logging_format() by remembering if blackhole engine
  is used and avoid a loop over all tables if it's not used
  (the common case).
- THD::decide_logging_format() is not called anymore if no tables are used
  for the statement. This will speed up pure stored procedure code with
  about 5%+ according to some simple tests.
- We now get annotated events on slave if a CREATE ... SELECT statement
  is transformed on the slave from statement to row logging.
- In the original code, the master could come into a state where row
  logging is enforced for all future events if statement could be used.
  This is now partly fixed.

Other changes:
- Ensure that all tables used by a statement has query_id set.
- Had to restore the row_logging flag for not used tables in
  THD::binlog_write_table_maps (not normal scenario)
- Removed injector::transaction::use_table(server_id_type sid, table tbl)
  as it's not used.
- Cleaned up set_slave_thread_options()
- Some more DBUG_ENTER/DBUG_RETURN, code comments and minor indentation
  changes.
- Ensure we only call THD::decide_logging_format_low() once in
  mysql_insert() (inefficiency).
- Don't annotate INSERT DELAYED
- Removed zeroing pos_in_table_list in THD::open_temporary_table() as it's
  already 0
2020-03-24 21:00:03 +02:00
Monty
4ef437558a Improve update handler (long unique keys on blobs)
MDEV-21606 Improve update handler (long unique keys on blobs)
MDEV-21470 MyISAM and Aria start_bulk_insert doesn't work with long unique
MDEV-21606 Bug fix for previous version of this code
MDEV-21819 2 Assertion `inited == NONE || update_handler != this'

- Move update_handler from TABLE to handler
- Move out initialization of update handler from ha_write_row() to
  prepare_for_insert()
- Fixed that INSERT DELAYED works with update handler
- Give an error if using long unique with an autoincrement column
- Added handler function to check if table has long unique hash indexes
- Disable write cache in MyISAM and Aria when using update_handler as
  if cache is used, the row will not be inserted until end of statement
  and update_handler would not find conflicting rows.
- Removed not used handler argument from
  check_duplicate_long_entries_update()
- Syntax cleanups
  - Indentation fixes
  - Don't use single character indentifiers for arguments
2020-03-24 21:00:02 +02:00
Sergey Vojtovich
da82e75901 handler::rebind()
- rename PFS specific rebind_psi() to generic rebind()
- call rebind independently of PFS compilation status
- allow rebind() return an error
2020-03-24 20:47:41 +02:00
Oleksandr Byelkin
fad47df995 Merge branch '10.4' into 10.5 2020-03-11 17:52:49 +01:00
Oleksandr Byelkin
b8c0e49670 Merge commit '10.3' into 10.4 2020-03-11 13:27:10 +01:00
Sergei Golubchik
c1c5222cae cleanup: PSI key is *always* the first argument 2020-03-10 19:24:23 +01:00
Sergei Golubchik
7af733a5a2 perfschema compilation, test and misc fixes 2020-03-10 19:24:23 +01:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Thirunarayanan Balathandayuthapani
f56dd0a12d MDEV-21693 ALGORITHM=INSTANT does not work for partitioned tables
- Flag ALTER_STORED_COLUMN_TYPE set while doing varchar extension
for partition table. Basically all partition supports
can_be_converted_by_engine() then it should be set to
ALTER_COLUMN_TYPE_CHANGE_BY_ENGINE.
2020-02-28 14:29:05 +05:30
Igor Babaev
cfa0506f8a MDEV-21554 Crash in JOIN_CACHE_BKAH::skip_index_tuple when mrr=on and
join_cache_level=6+

The patch fixes two similar bugs in the commit 8eeb689e9f
that added multi_range_read support to partitions. The commit opened
a possibility to join a partition table using BKA+MRR. However in some
cases it could lead to wrong results or even crashes.

This could happened when
- index condition pushdown was used to join the table or
- the joined table was an inner table of an outer join and 'not exist'
  optimization was applied or
- the join table was the inner table of a semi-join and the first match
  optimization was applied

The bugs were in the code of the call-back functions
- partition_multi_range_key_skip_record() and
- partition_multi_range_key_skip_index_tuple().
Each of this function consist only of an invocation of another function.
Yet a wrong parameter was passed at this invocation.

The fix was suggested by Sergey Petrunia and it is apparently in line
with original design.
The corresponding comprehensive test cases demonstrating the problems
caused by the bugs were constructed by me.
2020-02-25 00:50:23 -08:00
Oleksandr Byelkin
4b087e1754 Merge branch '10.4' into 10.5 2020-02-12 08:55:17 +01:00
Oleksandr Byelkin
646d1ec83a Merge branch '10.3' into 10.4 2020-02-11 14:40:35 +01:00
Sergei Petrunia
b3ded21922 ha_partition: add comments, comment out unused member variables 2020-02-05 00:54:16 +03:00
Alexander Barkov
f1e13fdc8d MDEV-21581 Helper functions and methods for CHARSET_INFO 2020-01-28 12:29:23 +04:00
Marko Mäkelä
28c89b7151 Merge 10.4 into 10.5 2019-12-16 07:47:17 +02:00
Oleksandr Byelkin
a15234bf4b Merge branch '10.3' into 10.4 2019-12-09 15:09:41 +01:00
Aleksey Midenkov
d759f764f6 MDEV-21233 Assertion `m_extra_cache' failed in ha_partition::late_extra_cache
Incorrect assertion of EXTRA_CACHE for
HA_EXTRA_PREPARE_FOR_UPDATE. The latter is related to read cache, but
must operate without it as a noop.

Related to Bug#55458 and MDEV-20441.
2019-12-05 15:25:30 +03:00
Faustin Lammler
2df2238cb8 Lintian complains on spelling error
The lintian check complains on spelling error:
https://salsa.debian.org/mariadb-team/mariadb-10.3/-/jobs/95739
2019-12-02 12:41:13 +02:00
Aleksey Midenkov
8ed646f071 Merge 10.4 into 10.5 2019-12-02 13:35:54 +03:00
Aleksey Midenkov
0b8b11b0b1 Merge 10.3 into 10.4 2019-12-02 12:51:53 +03:00
Aleksey Midenkov
a7cf0db3d8 MDEV-21011 Table corruption reported for versioned partitioned table after DELETE: "Found a misplaced row"
LIMIT history partitions cannot be checked by existing algorithm of
check_misplaced_rows() because working history partition is
incremented each time another one is filled. The existing algorithm
gets record and tries to decide partition id for it by
get_partition_id(). For LIMIT history it will just get first
non-filled partition.

To fix such partitions it is required to do REBUILD instead of REPAIR.
2019-12-02 11:48:37 +03:00
Kentoku
e066723a41 MDEV-18973 CLIENT_FOUND_ROWS wrong in spider
Get count from last_used_con->info
Contributed by willhan at Tencent Games
2019-11-29 23:23:57 +09:00
Aleksey Midenkov
0c05a2ed71 Merge 10.4 into 10.5 2019-11-25 17:24:09 +03:00
Aleksey Midenkov
33f55789d3 MDEV-18727 improve DML operation of System Versioning (10.4)
UPDATE, DELETE: replace linear search of current/historical records
with vers_setup_conds().

Additional DML cases in view.test
2019-11-25 16:01:43 +03:00
Aleksey Midenkov
0076dce2c8 MDEV-18727 improve DML operation of System Versioning
MDEV-18957 UPDATE with LIMIT clause is wrong for versioned partitioned tables

UPDATE, DELETE: replace linear search of current/historical records
with vers_setup_conds().

Additional DML cases in view.test
2019-11-22 14:29:03 +03:00
Marko Mäkelä
5b686af2ec Merge 10.4 into 10.5 2019-11-20 15:47:16 +02:00
Marko Mäkelä
1c92406337 Merge 10.3 into 10.4 2019-11-20 14:11:54 +08:00
Alexey Botchkov
7b5654f3e9 MDEV-14667 Assertion `used_parts > 0' failed in ha_partition::init_record_priority_queue.
Do not fail fi all the partitions were pruned out.
2019-11-20 00:33:32 +04:00
Marko Mäkelä
a9846f3299 Merge 10.4 into 10.5 2019-11-19 10:45:28 +08:00
Marko Mäkelä
589a1235b6 Merge 10.3 into 10.4 2019-11-19 01:32:50 +02:00
Sergei Petrunia
86167e908f MDEV-20611: MRR scan over partitioned InnoDB table produces "Out of memory" error
Fix partitioning and DS-MRR to work together

- In ha_partition::index_end(): take into account that ha_innobase (and
  other engines using DS-MRR) will have inited=RND when initialized for
  DS-MRR scan.
- In ha_partition::multi_range_read_next(): if the MRR scan is using
  HA_MRR_NO_ASSOCIATION mode, it is not guaranteed that the partition's
  handler will store anything into *range_info.
- In DsMrr_impl::choose_mrr_impl(): ha_partition will inquire partitions
  about how much memory their MRR implementation needs by passing
  *buffer_size=0. DS-MRR code didn't know about this (actually it used
  uint for buffer size calculation and would have an under-flow).
  Returning *buffer_size=0 made ha_partition assume that partitions do
  not need MRR memory and pass the same buffer to each of them.

  Now, this is fixed. If DS-MRR gets *buffer_size=0, it will return
  the amount of buffer space needed, but not more than about
  @@mrr_buffer_size.

* Fix ha_{innobase,maria,myisam}::clone. If ha_partition uses MRR on its
  partitions, and partition use DS-MRR, the code will call handler->clone
  with TABLE (*NOT partition*) name as an argument.
  DS-MRR has no way of knowing the partition name, so the solution was
  to have the ::clone() function for the affected storage engine to ignore
  the name argument and get it elsewhere.
2019-11-15 23:37:28 +03:00
Oleksandr Byelkin
3ad37ed0eb Merge 10.4 into 10.5 2019-11-07 08:52:30 +01:00
Marko Mäkelä
ec40980ddd Merge 10.3 into 10.4 2019-11-01 15:23:18 +02:00
Alexey Botchkov
6dce6aeceb MDEV-18244 Server crashes in ha_innobase::update_thd / ... / ha_partition::update_next_auto_inc_val.
Partition table with the AUTO_INCREMENT column we ahve to check if the
max value is properly loaded. So we need to open all tables in INSERT
PARTITION statement if necessary. Also we need to check if some
tables are pruned away and not count the max autoincrement in this case.
2019-11-01 09:39:43 +04:00
Marko Mäkelä
780d2bb8a7 Merge 10.4 into 10.5 2019-09-06 14:25:20 +03:00
Monty
97dd057702 Fixed issues when running mtr with --valgrind
- Note that some issues was also fixed in 10.2 and 10.4. I also fixed them
  here to be able to continue with making 10.5 valgrind safe again
- Disable connection threads warnings when doing shutdown
2019-08-23 22:03:54 +02:00
Marko Mäkelä
efb8485d85 Merge 10.3 into 10.4, except for MDEV-20265
The MDEV-20265 commit e746f451d5
introduces DBUG_ASSERT(right_op == r_tbl) in
st_select_lex::add_cross_joined_table(), and that assertion would
fail in several tests that exercise joins. That commit was skipped
in this merge, and a separate fix of MDEV-20265 will be necessary in 10.4.
2019-08-23 08:06:17 +03:00
Monty
938925211a MDEV-19254 Server crashes in maria_status with partitioned table
Bug was that storage_engine::info() was called with not opened table in
ha_partition::info(). Fixed by ensuring that we are using an opened table.
2019-08-19 19:49:45 +03:00
Marko Mäkelä
67ddb6507d Merge 10.4 into 10.5 2019-08-16 14:35:32 +03:00
Marko Mäkelä
1d15a28e52 Merge 10.3 into 10.4 2019-08-14 18:06:51 +03:00
Aleksey Midenkov
a20f6f9853 MDEV-20336 Assertion bitmap_is_set(read_partitions) upon SELECT FOR UPDATE from versioned table
Exclude SELECT and INSERT SELECT from vers_set_hist_part(). We cannot
likewise exclude REPLACE SELECT because it may REPLACE into itself
(and REPLACE generates history).

INSERT also does not generate history, but we have history
modification setting which might be interfered.
2019-08-14 17:32:19 +03:00
Marko Mäkelä
624dd71b94 Merge 10.4 into 10.5 2019-08-13 18:57:00 +03:00
Aleksey Midenkov
98758b52b3 MDEV-20068 History partition rotation is not done under LOCK TABLES
Wrong value F_WRLCK for thr_lock_type.
2019-08-11 12:32:08 +03:00
Eugene Kosov
c9aa495fb6 MDEV-19955 make argument of handler::ha_write_row() const
MDEV-19486 and one more similar bug appeared because handler::write_row() interface
welcomes to modify buffer by storage engine. But callers are not ready for that
thus bugs are possible in future.

handler::write_row():
handler::ha_write_row(): make argument const
2019-07-05 13:14:19 +03:00
Marko Mäkelä
984d7100cd Merge 10.4 into 10.5 2019-06-13 18:36:09 +03:00
Sergei Golubchik
27fcdb161c MDEV-16249 CHECKSUM TABLE for a spider table is not parallel and saves all data in memory in the spider head by default (#1328)
followup for be5c432a42

ha_partition::calculate_checksum() has to invoke calculate_checksum()
for partitions unconditionally, not under (HA_HAS_OLD_CHECKSUM | HA_HAS_NEW_CHECKSUM).
Because the server uses ::info() to ask for a live checksum, while
calculate_checksum() must, precisely, calculate it the slow way,
also for tables that don't have the live checksum at all.

Also, fix the compilation on Windows (ha_checksum/ulonglong type mix).
2019-06-11 18:42:45 +02:00
Kentoku SHIBA
be5c432a42
MDEV-16249 CHECKSUM TABLE for a spider table is not parallel and saves all data in memory in the spider head by default (#1328)
add checksum_null for setting null value of checksum
2019-06-11 00:25:08 +09:00
Robert Bindar
bf70430ead MDEV-17709 Remove handlerton::state 2019-06-06 22:09:31 +04:00
Alexander Barkov
d1d6fe9abf Using more of Sql_mode_save. Adding a similar class for THD::abort_on_warnings. 2019-05-28 10:26:08 +04:00
Marko Mäkelä
826f9d4f7e Merge 10.4 into 10.5 2019-05-23 10:32:21 +03:00
Monty
043a3a0176 Avoid not needed renames in ALTER TABLE
Removed not needed table renames when doing ALTER TABLE when engine
changes and both of the following is true:
- Either new or old engine does not store the table in files
- Neither old or new engine uses files from another engine

We also skip renames when ALTER TABLE does an explicit rename

This improves performance, especially for engines where rename is
a slow operation (like the upcoming S3 engine)
2019-05-23 01:20:18 +03:00
Monty
007f68c37f Replace ha_notify_table_changed() with notify_tabledef_changed()
Reason for the change was that ha_notify_table_changed() was done
after table open when .frm had been replaced, which caused failure
in engines that checks on open if .frm matches the engines table
definition.

Other changes:
- Remove not needed open/close call at end of inline alter table.
  Some test that depended on the table beeing in the table cache after
  ALTER TABLE had to be updated.
2019-05-23 01:20:18 +03:00
Oleksandr Byelkin
c07325f932 Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
Marko Mäkelä
be85d3e61b Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Oleksandr Byelkin
c51f85f882 Merge branch '10.2' into 10.3 2019-05-12 17:20:23 +02:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru
5543b75550 Update FSF Address
* Update wrong zip-code
2019-05-11 21:29:06 +03:00
Sergei Golubchik
ffb83ba650 cleanup: move checksum code to handler class
make live checksum to be returned in handler::info(),
and slow table-scan checksum to be calculated in handler::checksum().

part of
MDEV-16249 CHECKSUM TABLE for a spider table is not parallel and saves all data in memory in the spider head by default
2019-05-07 18:40:36 +02:00
Oleksandr Byelkin
8cbb14ef5d Merge branch '10.1' into 10.2 2019-05-04 17:04:55 +02:00
Marko Mäkelä
1cd31bc132 Bug#28573894 ALTER PARTITIONED TABLE ADD AUTO_INCREMENT DIFF RESULT DEPENDING ON ALGORITHM
For partitioned table, ensure that the AUTO_INCREMENT values will
be assigned from the same sequence. This is based on the following
change in MySQL 5.6.44:

commit aaba359c13d9200747a609730dafafc3b63cd4d6
Author: Rahul Malik <rahul.m.malik@oracle.com>
Date:   Mon Feb 4 13:31:41 2019 +0530

    Bug#28573894 ALTER PARTITIONED TABLE ADD AUTO_INCREMENT DIFF RESULT DEPENDING ON ALGORITHM

    Problem:
    When a partition table is in-place altered to add an auto-increment column,
    then its values are starting over for each partition.

    Analysis:
    In the case of in-place alter, InnoDB is creating a new sequence object
    for each partition. It is default initialized. So auto-increment columns
    start over for each partition.

    Fix:
    Assign old sequence of the partition to the sequence of next partition
    so it won't start over.

    RB#21148
    Reviewed by Bin Su <bin.x.su@oracle.com>
2019-04-25 14:12:45 +03:00
Sergey Vojtovich
5d8ca98997 Get rid of rea_create_table()
Moved rea_create_table() to the sole caller.

Also ha_create_partitioning_metadata(CHF_CREATE_FLAG) does cleanup on
error now.

Part of MDEV-17805 - Remove InnoDB cache for temporary tables.
2019-04-03 17:43:12 +04:00
Sachin
901e3ddf79 MDEV-18904 Assertion `m_part_spec.start_part >= m_part_spec.end_part' failed in ha_partition::index_read_idx_map
Remove the DBUG_ASSERT
2019-03-23 18:26:50 +05:30
Marko Mäkelä
514b305dfb Merge 10.3 into 10.4
The MDEV-17262 commit 26432e49d3
was skipped. In Galera 4, the implementation would seem to require
changes to the streaming replication.

In the tests archive.rnd_pos main.profiling, disable_ps_protocol
for SHOW STATUS and SHOW PROFILE commands until MDEV-18974
has been fixed.
2019-03-20 10:41:32 +02:00
Sergei Golubchik
f38c352172 post-merge: gcc 8 warnings 2019-03-17 13:06:56 +01:00
Marko Mäkelä
2a791c53ad Merge 10.3 into 10.4 2019-03-06 09:00:52 +02:00
Julius Goryavsky
50b3632fa4 MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-26 08:09:04 +02:00
Julius Goryavsky
2c734c980e MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-26 07:45:11 +02:00
Julius Goryavsky
243f829c1c MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-25 11:19:07 +02:00
Igor Babaev
e1de23b8d5 MDEV-16188 Introduced the notion of adjusted filter gain.
Due to inconsistent usage of different cost models to calculate
the cost of ref accesses we have to make the calculation of the
gain promising by usage a range filter more complex.
2019-02-14 00:17:20 -08:00
Igor Babaev
cd00d03fe2 MDEV-16188 Fixed the code of ha_partition::multi_range_read_info()
The code was rewritten in the same way as the code of
ha_partition::multi_range_read_info_const() had been rewritten
earlier.

The fix allowed to run spider.partition_mrr.
2019-02-10 21:15:48 -08:00
Igor Babaev
37deed3f37 Merge branch '10.4' into bb-10.4-mdev16188 2019-02-03 18:41:18 -08:00
Igor Babaev
658128af43 MDEV-16188 Use in-memory PK filters built from range index scans
This patch contains a full implementation of the optimization
that allows to use in-memory rowid / primary filters built for range  
conditions over indexes. In many cases usage of such filters reduce  
the number of disk seeks spent for fetching table rows.

In this implementation the choice of what possible filter to be applied  
(if any) is made purely on cost-based considerations.

This implementation re-achitectured the partial implementation of
the feature pushed by Galina Shalygina in the commit
8d5a11122c.

Besides this patch contains a better implementation of the generic  
handler function handler::multi_range_read_info_const() that
takes into account gaps between ranges when calculating the cost of
range index scans. It also contains some corrections of the
implementation of the handler function records_in_range() for MyISAM.

This patch supports the feature for InnoDB and MyISAM.
2019-02-03 14:56:12 -08:00
Monty
c0c62196ca Removed \n from sql_print_error() 2019-01-26 19:18:22 +02:00
Nikita Malyavin
6a73569f12 MDEV-16429: Assertion `!table || (!table->read_set || bitmap_is_set(table->read_set, field_index))' fails upon attempt to update virtual column on partitioned versioned table
When using buffered sort in `UPDATE`, keyread is used. In this case,
`TABLE::update_virtual_field` should be aborted, but it actually isn't,
because it is called not with a top-level handler, but with the one that
is actually going to access the disk. Here the problemm is issued with
partitioning, so the solution is to recursively mark for keyread all the
underlying partition handlers.

* ha_partition: update keyread state for child partitions

Closes #800
2018-12-20 08:06:55 +01:00
Sergei Golubchik
1823ce7304 cleanup: rename a flag, keep old name for compatibility 2018-11-20 15:05:52 +01:00
Marko Mäkelä
dde2ca4aa1 Merge 10.3 into 10.4 2018-11-19 20:22:33 +02:00
Aleksey Midenkov
50bc55d462 MDEV-16241 Assertion `inited==RND' failed in handler::ha_rnd_end()
Discrepancy in open indexes due to overwritten `read_partitions` upon
`ha_open()` in `ha_partition::clone()`.

[Fixes tempesta-tech/mariadb#551]
2018-11-13 10:30:27 +01:00
Marko Mäkelä
074c684099 Merge 10.3 into 10.4 2018-11-06 16:24:16 +02:00
Marko Mäkelä
df563e0c03 Merge 10.2 into 10.3
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.

main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
2018-11-06 09:40:39 +02:00
Marko Mäkelä
32062cc61c Merge 10.1 into 10.2 2018-11-06 08:41:48 +02:00
Sergei Golubchik
44f6f44593 Merge branch '10.0' into 10.1 2018-10-30 15:10:01 +01:00
Sergey Vojtovich
4ac85d6fd7 MDEV-14815 - Server crash or AddressSanitizer errors or valgrind warnings
in thr_lock / has_old_lock upon FLUSH TABLES

Explicit partition access of partitioned MEMORY table under LOCK TABLES
may cause subsequent statements to crash the server, deadlock, trigger
valgrind warnings or ASAN errors. Freed memory was being used due to
incorrect cleanup.

At least MyISAM and InnoDB don't seem to be affected, since their
THR_LOCK structures don't survive FLUSH TABLES. MEMORY keeps table shared
data (including THR_LOCK) even if there're no open instances.

There's partition_info::lock_partitions bitmap, which holds bits of
partitions allowed to be accessed after pruning. This bitmap is
updated for each individual statement.

This bitmap was abused in ha_partition::store_lock() such that when we
need to unlock a table, locked by LOCK TABLES, only locks for partitions
that were accessed by previous statement were released.

Eventually FLUSH TABLES frees THR_LOCK_DATA objects, which are still
linked into THR_LOCK lists. When such THR_LOCK gets reused we end up with
freed memory access.

Fixed by using ha_partition::m_locked_partitions bitmap similarly to
ha_partition::external_lock().
2018-10-19 19:09:48 +04:00
Marko Mäkelä
444c380ceb Merge 10.3 into 10.4 2018-10-05 08:09:49 +03:00
Sergei Golubchik
57e0da50bb Merge branch '10.2' into 10.3 2018-09-28 16:37:06 +02:00
Sergei Golubchik
e7d152293d MDEV-13089 identifier quoting in partitioning
cover ALTER TABLE
2018-09-21 20:22:14 +02:00
Sergei Golubchik
5f654c2e91 comments and dbug keywords 2018-09-21 15:05:55 +02:00
Nikita Malyavin
c16a54c02e MDEV-16429: Assertion `!table || (!table->read_set || bitmap_is_set(table->read_set, field_index))' fails upon attempt to update virtual column on partitioned versioned table
When using buffered sort in `UPDATE`, keyread is used. In this case,
`TABLE::update_virtual_field` should be aborted, but it actually isn't,
because it is called not with a top-level handler, but with the one that
is actually going to access the disk. Here the problemm is issued with
partitioning, so the solution is to recursively mark for keyread all the
underlying partition handlers.

* ha_partition: update keyread state for child partitions

Closes #800
2018-09-21 15:05:54 +02:00
Jacob Mathew
30d225698f MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3 to 10.2 and 10.1.  In 10.3
and 10.4 I have merged the comments and the test case.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Cherry-Picked:
  Commit ed49f9a on branch 10.3
2018-09-13 16:07:02 -07:00
Jacob Mathew
ed49f9aae2 MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3 to 10.2 and 10.1.  In 10.3
and 10.4 I have merged the comments and the test case.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Merged:
  Commit eb2ca3d on branch bb-10.2-MDEV-16912
2018-09-13 14:55:46 -07:00
Jacob Mathew
6c47c1c456 MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Cherry-Picked:
  Commit eb2ca3d on branch bb-10.2-MDEV-16912
2018-09-13 13:27:03 -07:00
Jacob Mathew
eb2ca3d445 MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.
2018-09-11 16:29:44 -07:00
Jacob Mathew
d6594847cf MDEV-16246: insert timestamp into spider table from mysqldump gets wrong time zone.
The problem occurred because the Spider node was incorrectly handling
timestamp values sent to and received from the data nodes.

The problem has been corrected as follows:
- Added logic to set and maintain the UTC time zone on the data nodes.
  To prevent timestamp ambiguity, it is necessary for the data nodes to use
  a time zone such as UTC which does not have daylight savings time.
- Removed the spider_sync_time_zone configuration variable, which did not
  solve the problem and which interfered with the solution.
- Added logic to convert to the UTC time zone all timestamp values sent to
  and received from the data nodes.  This is done for both unique and
  non-unique timestamp columns.  It is done for WHERE clauses, applying to
  SELECT, UPDATE and DELETE statements, and for UPDATE columns.
- Disabled Spider's use of direct update when any of the columns to update is
  a timestamp column.  This is necessary to prevent false duplicate key value
  errors.
- Added a new test spider.timestamp to thoroughly test Spider's handling of
  timestamp values.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Merged:
  Commit 97cc9d3 on branch bb-10.3-MDEV-16246
2018-07-24 15:57:13 -07:00
Jacob Mathew
813b739850 MDEV-16246: insert timestamp into spider table from mysqldump gets wrong time zone.
The problem occurred because the Spider node was incorrectly handling
timestamp values sent to and received from the data nodes.

The problem has been corrected as follows:
- Added logic to set and maintain the UTC time zone on the data nodes.
  To prevent timestamp ambiguity, it is necessary for the data nodes to use
  a time zone such as UTC which does not have daylight savings time.
- Removed the spider_sync_time_zone configuration variable, which did not
  solve the problem and which interfered with the solution.
- Added logic to convert to the UTC time zone all timestamp values sent to
  and received from the data nodes.  This is done for both unique and
  non-unique timestamp columns.  It is done for WHERE clauses, applying to
  SELECT, UPDATE and DELETE statements, and for UPDATE columns.
- Disabled Spider's use of direct update when any of the columns to update is
  a timestamp column.  This is necessary to prevent false duplicate key value
  errors.
- Added a new test spider.timestamp to thoroughly test Spider's handling of
  timestamp values.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Cherry-Picked:
  Commit 97cc9d3 on branch bb-10.3-MDEV-16246
2018-07-09 16:09:20 -07:00
Alexander Barkov
e61568ee93 Merge remote-tracking branch 'origin/10.3' into 10.4 2018-07-03 14:02:05 +04:00
Sergei Golubchik
36e59752e7 Merge branch '10.2' into 10.3 2018-06-30 16:39:20 +02:00
Marko Mäkelä
31c950cca8 Merge 10.1 into 10.2 2018-06-26 18:16:49 +03:00
Marko Mäkelä
c6392d52ee Merge 10.0 into 10.1 2018-06-26 17:34:44 +03:00
Andrei Elkin
28e1f1453f MDEV-15242 Poor RBR update performance with partitioned tables
Observed and described
partitioned engine execution time difference
between master and slave was caused by excessive invocation
of base_engine::rnd_init which was done also for partitions
uninvolved into Rows-event operation.
The bug's slave slowdown therefore scales with the number of partitions.

Fixed with applying an upstream patch.

References:
----------
https://bugs.mysql.com/bug.php?id=73648
Bug#25687813 REPLICATION REGRESSION WITH RBR AND PARTITIONED TABLES
2018-06-25 16:45:00 +03:00
Sergei Golubchik
45dee3fc83 cleanup: remove TABLE::vcol_set
use a read_set instead. a bit in the read_set means "the field
value is needed" (read or generated, whatever it takes).
2018-06-04 12:32:23 +02:00
Sergei Golubchik
c9061d1102 mysys: rename ME_xxx flags to match plugin api 2018-06-04 12:32:23 +02:00
Eugene Kosov
aa5683d12e MDEV-15380 Index for versioned table gets corrupt after partitioning and DELETE
In a test case Update occurs between Search and Delete/Update. This corrupts rowid
which Search saves for Delete/Update. Patch prevents this by using of
HA_EXTRA_REMEMBER_POS and HA_EXTRA_RESTORE_POS in a partition code.

This situation possibly occurs only with system versioning table and partition.
MyISAM and Aria engines are affected.

fix by midenok
Closes #705
2018-05-22 13:11:14 +02:00
Sergei Golubchik
b1a6d2826a cleanup: versioning style fixes
rename LString/XString classes, remove unused ones
2018-05-12 10:16:45 +02:00
Sergei Golubchik
0f956a0676 cleanup: hide HA_ERR_RECORD_DELETED in ha_rnd_next()
it's internal storage engine error, don't let it leak
into the upper layer.
2018-05-12 10:16:45 +02:00
Sergei Golubchik
c9717dc019 Merge branch '10.2' into 10.3 2018-05-11 13:15:10 +02:00
Sergei Golubchik
88a0bb83df MDEV-15626 Assertion on update virtual column in partitioned table
table.cc:
  virtual columns must be computed for INSERT, if they're part
  of the partitioning expression.

this change broke gcol.gcol_partition_innodb.
fix CHECK TABLE for partitioned tables and vcols.

sql_partition.cc:
  mark prerequisite base columns in full_part_field_set
ha_partition.cc
  initialize vcol_set accordingly
2018-05-10 12:48:23 +02:00
Michael Widenius
062a3176e7 Remove mem_alloc_error()
As thd->alloc() and new automatically calls my_error(ER_OUTOFMEORY)
there is no reason to call mem_alloc_error()

Other things:
- Fixed bug in mysql_unpack_partition() where lex.part_info was
  changed even if it would be a null pointer
2018-05-07 00:07:33 +03:00
Monty
30ebc3ee9e Add likely/unlikely to speed up execution
Added to:
- if (error)
- Lex
- sql_yacc.yy and sql_yacc_ora.yy
- In header files to alloc() calls
- Added thd argument to thd_net_is_killed()
2018-05-07 00:07:32 +03:00
Alexey Botchkov
4968049799 MDEV-11084 Select statement with partition selection against MyISAM
table opens all partitions.

Not-used partitions are not closed now.
2018-04-28 15:16:45 +04:00
Monty
ab1941266c Move alter partition flags to alter_info->partition_flags
This is done to get more free flag bits for alter_info->flags

Renamed all ALTER PARTITION defines to start with ALTER_PARTITION_
Renamed ALTER_PARTITION to ALTER_PARTITION_INFO
Renamed ALTER_TABLE_REORG to ALTER_PARTITION_TABLE_REORG

Other things:
- Shifted some ALTER_xxx defines to get empty bits at end
2018-03-29 13:59:41 +03:00
Monty
2dbeebdb16 Changed static const in Alter_info and Alter_online_info to defines
Main reason was to make it easier to print the above structures in
a debugger. Additional benefits is that I was able to use same
defines for both structures, which simplifes some code.

Most of the code is just removing Alter_info:: and Alter_inplace_info::
from alter table flags.

Following renames was done:
HA_ALTER_FLAGS        -> alter_table_operations
CHANGE_CREATE_OPTION  -> ALTER_CHANGE_CREATE_OPTION
Alter_info::ADD_INDEX -> ALTER_ADD_INDEX
DROP_INDEX            -> ALTER_DROP_INDEX
ADD_UNIQUE_INDEX      -> ALTER_ADD_UNIQUE_INDEX
DROP_UNIQUE_INDEx     -> ALTER_DROP_UNIQUE_INDEX
ADD_PK_INDEX          -> ALTER_ADD_PK_INDEX
DROP_PK_INDEX         -> ALTER_DROP_PK_INDEX
Alter_info:ALTER_ADD_COLUMN    -> ALTER_PARSE_ADD_COLUMN
Alter_info:ALTER_DROP_COLUMN   -> ALTER_PARSE_DROP_COLUMN
Alter_inplace_info::ADD_INDEX  -> ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX
Alter_inplace_info::DROP_INDEX -> ALTER_DROP_NON_UNIQUE_NON_PRIM_INDEX

Other things:
- Added typedef alter_table_operatons for alter table flags
- DROP CHECK CONSTRAINT can now be done online
- Added checks for Aria tables in alter_table_online.test
- alter_table_flags now takes an ulonglong as argument.
- Don't support online operations if checksum option is used.
- sql_lex.cc doesn't add ALTER_ADD_INDEX if index is not created
2018-03-29 13:59:40 +03:00
Sergei Golubchik
8f9c64000e MDEV-15336 Server crashes in handler::print_error / ha_partition::print_error upon query timeout
set m_last_part to something meaningful when opening partitions
2018-02-24 01:28:51 +01:00
Sergei Golubchik
e36c5ec0a5 PARTITION BY SYSTEM_TIME INTERVAL ...
Lots of changes:
* calculate the current history partition in ::external_lock(),
  not in ::write_row() or ::update_row()
* remove dynamically collected per-partition row_end stats
* no full table scan in open_table_from_share to calculate these
  stats, no manual MDL/thr_locks in open_table_from_share
* no shared stats in TABLE_SHARE = no mutexes or condition waits when
  calculating current history partition
* always compare timestamps, don't convert them to MYSQL_TIME
  (avoid DST ambiguity, and it's faster too)
* correct interval handling, 1 month = 1 month, not 30 * 24 * 3600 seconds
* save/restore first partition start time, and count intervals from there
* only allow to drop first partitions if INTERVAL
* when adding new history partitions, split the data in the last history
  parition, if it was overflowed
* show partition boundaries in INFORMATION_SCHEMA.PARTITIONS
2018-02-23 19:17:48 +01:00
Sergei Golubchik
c4c81a5b04 cleanup: partition_info::check_constants
partition_info had a bunch of function pointers to avoid if()'s
when invoking part_type specific functionality (like get_part_id, etc).
But check_range_constants() and check_list_constants() were still
invoked conditionally, with if()'s.

Create partition_info::check_constants function pointer, get rid
of if()'s

Also remove alloc argument of check_range_constants(), added
in 26a3ff0a22. Broken system versioning will be fixed in
following commits.
2018-02-23 15:33:23 +01:00
Aleksey Midenkov
df0e1817c7 Vers SQL: partition rotation by INTERVAL fix
Update partition stats on ha_partition::write_row()
2018-02-23 15:33:22 +01:00
Sergei Golubchik
9f6a7ed2d7 SQL: Truncate history of partitioned table [fixes #399, closes #403]
also, don't rotate versioning partitions for DELETE HISTORY

originally by: Aleksey Midenkov
2018-02-23 15:33:21 +01:00
Sergei Golubchik
187a163c78 cleanup: ha_partition::update_row/delete_row
implement log-term TODO item, convert redundant if()-s into asserts.
2018-02-23 15:33:21 +01:00
Marko Mäkelä
b006d2ead4 Merge bb-10.2-ext into 10.3 2018-02-15 10:22:03 +02:00
Alexey Botchkov
ba125eca55 MDEV-15215 main.partition_explicit_prune fails in bulidbot with assertion failures and server crashes.
ha_partition::get_open_file_sample() implemented to be used
        when we need a file sample that is surely opened.
2018-02-13 18:06:15 +04:00
Monty
54db0be3be Added Max_index_length and Temporary to SHOW TABLE STATUS
- Max_index_length is supported by MyISAM and Aria tables.
- Temporary is a placeholder to signal that a table is a
  temporary table. For the moment this is always "N", except
  "Y" for generated information_schema tables and NULL for
  views. Full temporary table support will be done in another task.
  (No reason to have to update a lot of result files twice in a row)
2018-02-12 17:17:26 +02:00
Vladislav Vaintroub
6c279ad6a7 MDEV-15091 : Windows, 64bit: reenable and fix warning C4267 (conversion from 'size_t' to 'type', possible loss of data)
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.

This fix excludes rocksdb, spider,spider, sphinx and connect for now.
2018-02-06 12:55:58 +00:00
Alexander Barkov
217fc122c8 Merge remote-tracking branch 'origin/bb-10.2-ext' into 10.3 2018-02-04 18:40:06 +04:00
Monty
d69642dedd Added name to MEM_ROOT for esier debugging
This will make it easier to how memory allocation is done when debugging
with either DBUG or gdb.

Will especially help when debugging stored procedures

Main change is a name argument as second argument to init_alloc_root()
init_sql_alloc()

Other things:
- Added DBUG_ENTER/EXIT to some Virtual_tmp_table functions
2018-02-02 11:08:36 +02:00
Monty
a7e352b54d Changed database, tablename and alias to be LEX_CSTRING
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db

Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
  for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because process list_db wasn't always
  correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
  handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed not used version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
  NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
2018-01-30 21:33:55 +02:00
Marko Mäkelä
921c5e9314 Merge bb-10.2-ext into 10.3
MDEV-11415 Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY

Move a test from innodb.rename_table_debug to innodb.alter_copy.

ha_innobase::extra(HA_EXTRA_BEGIN_ALTER_COPY): Register id-versioned
tables so that mysql.transaction_registry will be updated, even for
empty tables that are subjected to ALTER TABLE…ALGORITHM=COPY.
2018-01-30 21:26:53 +02:00
Marko Mäkelä
0c1f220611 Merge 10.2 into bb-10.2-ext 2018-01-30 20:47:12 +02:00
Marko Mäkelä
0ba6aaf030 MDEV-11415 Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY
If a crash occurs during ALTER TABLE…ALGORITHM=COPY, InnoDB would spend
a lot of time rolling back writes to the intermediate copy of the table.
To reduce the amount of busy work done, a work-around was introduced in
commit fd069e2bb3 in MySQL 4.1.8 and 5.0.2,
to commit the transaction after every 10,000 inserted rows.

A proper fix would have been to disable the undo logging altogether and
to simply drop the intermediate copy of the table on subsequent server
startup. This is what happens in MariaDB 10.3 with MDEV-14717,MDEV-14585.
In MariaDB 10.2, the intermediate copy of the table would be left behind
with a name starting with the string #sql.

This is a backport of a bug fix from MySQL 8.0.0 to MariaDB,
contributed by jixianliang <271365745@qq.com>.

Unlike recent MySQL, MariaDB supports ALTER IGNORE. For that operation
InnoDB must for now keep the undo logging enabled, so that the latest
row can be rolled back in case of an error.

In Galera cluster, the LOAD DATA statement will retain the existing
behaviour and commit the transaction after every 10,000 rows if
the parameter wsrep_load_data_splitting=ON is set. The logic to do
so (the wsrep_load_data_split() function and the call
handler::extra(HA_EXTRA_FAKE_START_STMT)) are joint work
by Ji Xianliang and Marko Mäkelä.

The original fix:

Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
Date:   Wed Dec 2 16:09:15 2015 +0530

Bug#17479594 AVOID INTERMEDIATE COMMIT WHILE DOING ALTER TABLE ALGORITHM=COPY

Problem:

During ALTER TABLE, we commit and restart the transaction for every
10,000 rows, so that the rollback after recovery would not take so long.

Fix:

Suppress the undo logging during copy alter operation. If fts_index is
present then insert directly into fts auxiliary table rather
than doing at commit time.

ha_innobase::num_write_row: Remove the variable.

ha_innobase::write_row(): Remove the hack for committing every 10000 rows.

row_lock_table_for_mysql(): Remove the extra 2 parameters.

lock_get_src_table(), lock_is_table_exclusive(): Remove.

Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
Reviewed-by: Jon Olav Hauglid <jon.hauglid@oracle.com>
2018-01-30 20:24:23 +02:00
Vladislav Vaintroub
9bb93b86c2 Fix warnings. 2018-01-29 17:01:58 +00:00
Alexey Botchkov
b4a2baffa8 MDEV-11084 Select statement with partition selection against MyISAM table opens all partitions.
Now we don't open partitions if it was explicitly cpecified.
        ha_partition::m_opened_partition bitmap added to track
        partitions that were actually opened.
2018-01-29 11:01:14 +04:00
Aleksey Midenkov
c59c1a0736 System Versioning 1.0 pre8
Merge branch '10.3' into trunk
2018-01-10 12:36:55 +03:00
Marko Mäkelä
145ae15a33 Merge bb-10.2-ext into 10.3 2018-01-04 09:22:59 +02:00
Monty
fbab79c9b8 Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext
Conflicts:
	cmake/make_dist.cmake.in
	mysql-test/r/func_json.result
	mysql-test/r/ps.result
	mysql-test/t/func_json.test
	mysql-test/t/ps.test
	sql/item_cmpfunc.h
2018-01-01 19:39:59 +02:00
Vicențiu Ciorbaru
9aeb5d01d6 Merge remote-tracking branch 'origin/10.1' into bb-10.2-vicentiu 2017-12-28 19:27:00 +02:00
Sergei Golubchik
5377242fff MDEV-14026 ALTER TABLE ... DELAY_KEY_WRITE=1 creates table copy for partitioned MyISAM table with DATA DIRECTORY/INDEX DIRECTORY options
set data_file_name and index_file_name in HA_CREATE_INFO
before calling check_if_incompatible_data()
2017-12-25 15:18:21 +01:00
Sergei Golubchik
6d8b1bd620 cleanup: ha_partition::update_create_info 2017-12-25 15:18:21 +01:00
Aleksey Midenkov
b55a149194
Timestamp-based versioning for InnoDB [closes #209]
* Removed integer_fields check
* Reworked Vers_parse_info::check_sys_fields()
* Misc renames
* versioned as vers_sys_type_t

* Removed versioned_by_sql(), versioned_by_engine()

versioned() works as before;
versioned(VERS_TIMESTAMP) is versioned_by_sql();
versioned(VERS_TRX_ID) is versioned_by_engine().

* create_tmp_table() fix
* Foreign constraints for timestamp-based
* Range auto-specifier fix
* SQL: 1-row partition rotation fix [fixes #260]
* Fix 'drop system versioning, algorithm=inplace'
2017-12-18 19:03:51 +03:00
Aleksey Midenkov
4624e565f3 System Versioning 1.0 pre6
Merge remote-tracking branch 'mariadb/bb-10.3-temporal-serg' into trunk
2017-12-15 18:12:18 +03:00
Sergei Golubchik
18405e5fd9 Partitioning syntax for versioning
partition by system_time (
    partition p0 history,
    partition pn current
 )
2017-12-14 20:19:14 +01:00
Eugene Kosov
765569602d System Versioning 1.0 pre4
Merge branch '10.3' into trunk
2017-12-14 17:52:08 +03:00
Monty
d4a4185451 MDEV Assertion `partition_id == m_extra_cache_part_id' failed in ha_partition::late_extra_no_cache
The problem was that multi_range_read_info_const() called
multi_range_key_create_key() which changed m_part_spec.start_part,
while there was an activ table scan ongoing.

Fixed by copying and restoring m_part_spec around
multi_range_key_create_calls()
2017-12-12 15:10:53 +02:00
Aleksey Midenkov
79dd77e6ae System Versioning 1.0 pre3
Merge branch '10.3' into trunk
2017-12-11 15:43:41 +03:00
Monty
60df17e95a Remove compiler warnings 2017-12-03 13:58:36 +02:00
Kentoku SHIBA
e53ef202bd Adding direct update/delete to the server and to the partition engine.
Add support for direct update and direct delete requests for spider.
A direct update/delete request handles all qualified rows in a single
operation rather than one row at a time.

Contains Spiral patches:
006_mariadb-10.2.0.direct_update_rows.diff      MDEV-7704
008_mariadb-10.2.0.partition_direct_update.diff MDEV-7706
010_mariadb-10.2.0.direct_update_rows2.diff     MDEV-7708
011_mariadb-10.2.0.aggregate.diff               MDEV-7709
027_mariadb-10.2.0.force_bulk_update.diff       MDEV-7724
061_mariadb-10.2.0.mariadb-10.1.8.diff          MDEV-12870

- The differences compared to the original patches:
  - Most of the parameters of the new functions are unnecessary.  The
    unnecessary parameters have been removed.
  - Changed bit positions for new handler flags upon consideration of
    handler flags not needed by other Spiral patches and handler flags
    merged from MySQL.
  - Added info_push() (Was originally part of bulk access patch)
  - Didn't include code related to handler socket
  - Added HA_CAN_DIRECT_UPDATE_AND_DELETE

Original author: Kentoku SHIBA
First reviewer:  Jacob Mathew
Second reviewer: Michael Widenius
2017-12-03 13:58:36 +02:00
Michael Widenius
d1e4ecec07 Spider fix: Stop searching next when m_top_entry=NO_CURRENT_PART_ID 2017-12-03 13:58:36 +02:00
Michael Widenius
352feb49c4 Cleanups for ha_partition.cc
- Ensure that var= doesn't have a space before =
- Fixed DBUG_PRINT to use %u for unsigned types
- Use "enter" when printing function arguments
- Fixed typos
- Added some extra DBUG_PRINT
- Removed not needed assignment
2017-12-03 13:58:36 +02:00
Monty
3907ff2d24 Fix index scan cleanup in the partition engine.
Spiral Patch 057: 057_mariadb-10.2.0.partition_index_end.diff MDEV-12999

Original author: Kentoku SHIBA
First reviewer:  Jacob Mathew
Second reviewer: Michael Widenius
2017-12-03 13:58:35 +02:00
Monty
f26e14e2d8 Adding option to tell that cmp_ref handler call is expensive
- In Spider, calling cmp_ref() can be very expensive. In ha_partition.cc
  we don't anymore sort rows according to position for the Spider
  engine.
- Removed Spider specific call info(HA_EXTRA_STARTING_ORDERED_INDEX_SCAN)
  from handle_ordered_index_scan(). It's caused performance issues and
  does not change results for queries with ORDER BY.
- The visible effect of this patch is that for some storage engines,
  rows may be returned in a different order if there is no ORDER BY clause.

- Based in Spiral Patch 052:
  052_mariadb-10.2.0.add_partition_skip_pk_sort_for_non_clustered_index
  MDEV-7748
- The major difference from original patch is that there is no variable to
  get the old behaviour.

Other things:
- Optimized ha_partition::cmp_ref() and cmp_part_ids() to make them
  simpler and faster.
- Changed arguments to cmp_key_part_id() to be same as
  cmp_key_rowid_part_id to simplify code.

Original author: Kentoku SHIBA
First reviewer:  Jacob Mathew
Second reviewer: Michael Widenius
2017-12-03 13:58:35 +02:00