Commit graph

2015 commits

Author SHA1 Message Date
Sergei Golubchik
c55c292832 introduce hton->drop_table() method
first step in moving drop table out of the handler.
todo: other methods that don't need an open table

for now hton->drop_table is optional, for backward compatibility
reasons
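
A minimal sketch of the shape of this change, with stand-in types and a
hypothetical fallback helper (this is not the actual server code):

  // Engines may leave drop_table unset; the server then falls back to
  // the old path that opens the table through a handler instance.
  struct handlerton {
    int (*drop_table)(handlerton *hton, const char *path); // optional
  };

  static int legacy_drop_via_open_handler(const char *path) {
    (void) path;
    return 0;  // stand-in for the old open-then-delete path
  }

  static int ha_drop_table_sketch(handlerton *hton, const char *path) {
    if (hton->drop_table)                      // new, direct path
      return hton->drop_table(hton, path);
    return legacy_drop_via_open_handler(path); // backward compatibility
  }
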
2020-07-04 01:44:46 +02:00
Monty
5211af1c16 Merge remote-tracking branch 'origin/10.3' into 10.4 2020-07-03 00:35:28 +03:00
Monty
29f9e679ad Don't copy uninitialized bytes when copying varstrings
When using field_conv(), which is called in case of a field1=field2 copy in
fill_records(), the full varstring was copied, including uninitialized bytes.
This caused valgrind to complain about the use of uninitialized bytes when
using Aria static length records.
Fixed by not using memcpy when copying varstrings and instead copying only
the real bytes.
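
The idea of the fix, as a hedged sketch on a varstring with a 2-byte
length prefix (layout and names here are illustrative):

  #include <cstdint>
  #include <cstring>

  // Copy only the length prefix and the used bytes, never the whole
  // buffer, so the uninitialized tail of 'from' is not read.
  static void copy_varstring_sketch(uint8_t *to, const uint8_t *from) {
    uint16_t used;
    std::memcpy(&used, from, 2);       // 2-byte length prefix (host order)
    std::memcpy(to, from, 2u + used);  // prefix + real bytes only
  }
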
2020-07-02 14:25:41 +03:00
Monty
60f08dd555 MDEV-22925 ALTER TABLE s3_table ENGINE=Aria can cause failure on slave
When converting a table (test.s3_table) from S3 to another engine, the
following will be logged to the binary log:

DROP TABLE IF EXISTS test.t1;
CREATE OR REPLACE TABLE test.t1 (...) ENGINE=new_engine
INSERT rows to test.t1 in binary-row-log-format

The bug is that the above statements are logged one by one to the binary
log. This means that a fast slave, configured to use the same S3 storage
as the master, would be able to execute the DROP and CREATE from the
binary log before the master has finished the ALTER TABLE.
In this case the slave would ignore the DROP (as it's on an S3 table) but
it would stop on the CREATE of the local table, as the table still exists
in S3. The REPLACE part would be ignored by the slave as it can't touch
the S3 table.

The fix is to ensure that all the above statements are written to the
binary log AFTER the table has been deleted from S3.
2020-06-19 12:03:13 +03:00
Monty
10b88deb74 Changes needed for ColumnStore and insert cache
MCOL-3875 Columnstore write cache

The main change is to make the thr_lock function get_status return a
value that indicates whether we have to abort the lock.
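
A hedged sketch of that interface change (the signature and the abort
convention are illustrative, not the exact thr_lock ones):

  // get_status used to return void; now it can tell the caller to abort.
  typedef bool (*get_status_func)(void *param, bool concurrent_insert);

  static bool grab_lock_sketch(get_status_func get_status, void *param) {
    if (get_status && get_status(param, false))
      return false;  // engine (e.g. a write cache) asked us to abort
    return true;     // lock granted
  }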

Other thing:
- Made start_bulk_insert() and end_bulk_insert() protected so that the
  insert cache can use these
2020-06-14 19:39:42 +03:00
Monty
5bcb1d6532 MDEV-11412 Ensure that table is truly dropped when using DROP TABLE
The used code is largely based on code from Tencent

The problem is that in some rare cases there may be a conflict between .frm
files and the files in the storage engine. In this case the DROP TABLE
was not able to properly drop the table.

Some MariaDB/MySQL forks have solved this by adding a FORCE option to
DROP TABLE. After some discussion among MariaDB developers, we concluded
that users expect DROP TABLE to always work, even if the
table is not consistent. There should not be a need to use a
separate keyword to ensure that the table is really deleted.

The solution used is:
- If the .frm file doesn't exist, try dropping the table from all storage
  engines.
- If the .frm file exists but the table does not exist in the engine,
  try dropping the table from all storage engines.
- Update storage engines using many table files (.CSV, MyISAM, Aria) to
  succeed with the drop even if some of the files are missing.
- Add HTON_AUTOMATIC_DELETE_TABLE to handlertons where delete_table()
  is not needed and always succeeds. This is used by ha_delete_table_force()
  to know which handlers to ignore when trying to drop a table without
  a .frm file (see the sketch below).

The disadvantage of this solution is that a DROP TABLE on a non-existing
table will be a bit slower, as we have to ask all active storage engines
if they know anything about the table.
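
A minimal sketch of the loop this implies, with stand-in types and an
illustrative flag value (a hypothetical analogue of
ha_delete_table_force(), not the server code):

  #include <vector>

  static const unsigned HTON_AUTOMATIC_DELETE_TABLE = 1U; // value illustrative

  struct handlerton {
    unsigned flags;
    bool (*delete_table)(const char *path);  // returns true on success
  };

  // Ask every active engine to drop the table, skipping engines whose
  // files disappear automatically together with the .frm file.
  static bool drop_table_force_sketch(const std::vector<handlerton*> &engines,
                                      const char *path) {
    bool dropped = false;
    for (handlerton *hton : engines) {
      if (hton->flags & HTON_AUTOMATIC_DELETE_TABLE)
        continue;                 // delete_table() not needed for these
      if (hton->delete_table && hton->delete_table(path))
        dropped = true;           // keep going: after an aborted drop,
                                  // leftovers may exist in several engines
    }
    return dropped;
  }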

Other things:
- Added a new flag MY_IGNORE_ENOENT to my_delete() to not give an error
  if the file doesn't exist. This simplifies some of the code.
- Don't clear thd->error in ha_delete_table() if there was an active
  error. This is a bug fix.
- handler::delete_table() will not abort if the first file doesn't exist.
  This is a bug fix to handle the case when a drop table was aborted in
  the middle.
- Cleaned up mysql_rm_table_no_locks() to ensure that if_exists uses the
  same code path as when it's not used.
- Use non_existing_table_error() to detect if the table didn't exist.
  The old code used different error tests in different places.
- Table_triggers_list::drop_all_triggers() now drops the trigger file if
  it can't be parsed, instead of leaving it hanging around (bug fix)
- InnoDB no longer prints an error about the .frm file being out of sync
  with the InnoDB data dictionary if the .frm file does not exist. This
  change was required to be able to try to drop an InnoDB table when the
  .frm file doesn't exist.
- Fixed a bug in mi_delete_table() where the .MYD file would not be
  dropped if the .MYI file didn't exist.
- Fixed a memory leak in Mroonga when deleting a non-existing table
- Fixed a memory leak in Connect when deleting a non-existing table

Bugs introduced by the original version of this commit and fixed here:
MDEV-22826 Presence of Spider prevents tables from being force-deleted from
           other engines
2020-06-14 19:39:42 +03:00
Sergei Petrunia
d7d80689b3 MDEV-15101: Stop ANALYZE TABLE from flushing table definition cache
Apply this patch from Percona Server (amended for 10.5):

commit cd7201514fee78aaf7d3eb2b28d2573c76f53b84
Author: Laurynas Biveinis <laurynas.biveinis@gmail.com>
Date:   Tue Nov 14 06:34:19 2017 +0200

    Fix bug 1704195 / 87065 / TDB-83 (Stop ANALYZE TABLE from flushing table definition cache)

    Make ANALYZE TABLE stop flushing affected tables from the table
    definition cache, which has the effect of not blocking any subsequent
    new queries involving the table if there's a parallel long-running
    query:

    - new table flag HA_ONLINE_ANALYZE, return it for InnoDB and TokuDB
      tables;
    - in mysql_admin_table, if we are performing ANALYZE TABLE, and the
      table flag is set, do not remove the table from the table
      definition cache, do not invalidate query cache;
    - in partitioning handler, refresh the query optimizer statistics
      after ANALYZE if the underlying handler supports HA_ONLINE_ANALYZE;
    - new testcases main.percona_nonflushing_analyze_debug,
      parts.percona_nonflushing_analyze_debug and a supporting debug sync
      point.

    For TokuDB, this change exposes bug TDB-83 (Index cardinality stats
    updated for handler::info(HA_STATUS_CONST), not often enough for
    tokudb_cardinality_scale_percent). TokuDB may return different
    rec_per_key values depending on dynamic variable
    tokudb_cardinality_scale_percent value. The server does not have a way
    of knowing that changing this variable invalidates the previous
    rec_per_key values in any opened table shares, and so does not call
    info(HA_STATUS_CONST) again. Fix by updating rec_per_key for both
    HA_STATUS_CONST and HA_STATUS_VARIABLE. This also forces a re-record
    of tokudb.bugs.db756_card_part_hash_1_pick, with the new output
    seeming to be more correct.
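
A hedged sketch of the resulting decision in mysql_admin_table() (the
flag value and the helper are illustrative):

  typedef unsigned long long Table_flags;
  static const Table_flags HA_ONLINE_ANALYZE = 1ULL << 40; // value illustrative

  // Flush the table definition and query caches after admin commands,
  // except for ANALYZE on an engine that declares HA_ONLINE_ANALYZE.
  static bool admin_needs_flush(bool is_analyze, Table_flags flags) {
    return !(is_analyze && (flags & HA_ONLINE_ANALYZE));
  }
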
2020-06-12 20:29:05 +03:00
Sergei Golubchik
89a33303c4 remove dead code
reduce the amount of engine-specific code in the server,
particularly as it does not serve any purpose now.

may be needed for VP engine,
to be reconsidered in MDEV-7795
2020-06-09 14:32:43 +02:00
Marko Mäkelä
4a0b56f604 Merge 10.4 into 10.5 2020-05-31 10:28:59 +03:00
Marko Mäkelä
6da14d7b4a Merge 10.3 into 10.4 2020-05-30 11:04:27 +03:00
Monty
df4ab26a6b SHOW TABLE STATUS now shows if an Aria table is transactional or not
This change also affects information_schema.tables

The create table option "transactional=0 | 1" is now always shown for
storage engines that support both transactional (crash-safe) tables and
non-transactional tables.

Before this patch the transactional=... option was only shown if the user
specified transactional=... in the CREATE TABLE or ALTER TABLE statement.
The reason for the change was to make it easy to know if an Aria table
is transactional or not.
2020-05-29 22:47:37 +03:00
Marko Mäkelä
dad7a8ee7d Merge 10.2 into 10.3 2020-05-27 17:10:39 +03:00
Monty
403dacf6a9 Fixed crash in aria recovery when using bulk insert
MDEV-20578 Got error 126 when executing undo undo_key_delete
upon Aria crash recovery

The crash happens in this scenario:
- Table with unique keys and non unique keys
- Batch insert (LOAD DATA or INSERT ... SELECT) with REPLACE
- Some insert succeeds followed by duplicate key error

In the above scenario the table gets corrupted.

The bug was that we didn't generate any undo entry for the
failed insert, as the whole insert can be ignored by undo.
The code did, however, not take into account that when bulk
insert is used, we write cached keys to the file on
failure, and undo would wrongly ignore these.

Fixed by moving the writing of the cached keys to after we write
the aborted-insert event to the log.
2020-05-26 20:05:17 +03:00
Monty
4102f1589c Aria will now register its transactions
MDEV-22531 Remove maria::implicit_commit()
MDEV-22607 Assertion `ha_info->ht() != binlog_hton' failed in
           MYSQL_BIN_LOG::unlog_xa_prepare

From the handler point of view, Aria now looks like a transactional
engine. One effect of this is that we don't need to call
maria::implicit_commit() anymore.

This change also forces the server to call trans_commit_stmt() after doing
any reads or writes to system tables.  This work will also make it easier
to later allow users to have system tables in engines other than Aria.

To handle the case that Aria doesn't support rollback, a new
handlerton flag, HTON_NO_ROLLBACK, was added for engines that have
transactions without rollback (for the moment only binlog and Aria).

Other things
- Moved freeing of MARIA_SHARE to a separate function as the MARIA_SHARE
  can still be part of a transaction even if the table has been closed.
- Changed Aria checkpoint to use the new MARIA_SHARE free function. This
  fixes a possible memory leak when using S3 tables
- Changed testing of binlog_hton to instead test for HTON_NO_ROLLBACK
  (see the sketch below)
- Removed checking of has_transaction_manager() in handler.cc as we can
  assume that if the transaction was started by the engine, it does
  support transactions.
- Added new class 'start_new_trans' that can be used to start independent
  sub transactions, for example while reading mysql.proc, using help or
  status tables etc.
- open_system_tables...() and open_proc_table_for_read() no longer take
  an Open_tables_backup list. This is now handled by 'start_new_trans'.
- Split thd::has_transactions() to thd::has_transactions() and
  thd::has_transactions_and_rollback()
- Added handlerton code to free cached transactions objects.
  Needed by InnoDB.
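
A small sketch of the flag test that replaces the binlog_hton comparison
(the flag value and types are stand-ins):

  static const unsigned HTON_NO_ROLLBACK = 1U << 1; // value illustrative

  struct handlerton { unsigned flags; };

  // The binlog pseudo-engine and Aria set HTON_NO_ROLLBACK: they take
  // part in transactions but cannot undo their work on rollback.
  static bool engine_can_rollback(const handlerton &hton) {
    return !(hton.flags & HTON_NO_ROLLBACK);
  }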

2020-05-23 12:29:10 +03:00
Monty
7ae812cf2c Fix that BACKUP STAGE BLOCK_COMMIT blocks commit to the Aria engine
MDEV-22468 BACKUP STAGE BLOCK_COMMIT should block commit in the Aria engine

This is needed to ensure that mariabackup works properly with Aria tables

This code adds new calls to ha_maria::implicit_commit(). These will be
deleted by MDEV-22531 (Remove maria::implicit_commit()).
2020-05-23 12:29:10 +03:00
Sergei Golubchik
13038e4705 Merge branch '10.4' into 10.5 2020-05-09 20:43:36 +02:00
Sergei Petrunia
8d85715d50 MDEV-21794: Optimizer flag rowid_filter leads to long query
The Rowid Filter check is just like the Index Condition Pushdown check:
before we check the filter, we must check if we have walked out of the
range we are scanning. (If we have, we should return and not continue
the scan.)

Consequences of this:
- Rowid filtering doesn't work for keys that have partially-covered
  blob columns (just like Index Condition Pushdown)
- The rowid filter function has three return values: CHECK_POS (passed),
  CHECK_NEG (filtered out), and CHECK_OUT_OF_RANGE.

All of the above is implemented in this patch
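
A sketch of the three-way result in use (the enumerator names are from
this message; the surrounding code is illustrative):

  enum rowid_filter_check { CHECK_POS, CHECK_NEG, CHECK_OUT_OF_RANGE };

  // As with Index Condition Pushdown: first decide whether the current
  // key is still inside the scanned range; only then consult the filter.
  static rowid_filter_check check_rowid_filter(bool in_range, bool passes) {
    if (!in_range)
      return CHECK_OUT_OF_RANGE;  // caller must end the scan, not continue
    return passes ? CHECK_POS : CHECK_NEG;
  }
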
2020-05-07 12:27:17 +02:00
Sergei Golubchik
67aaf51cf9 cleanup: ha_external_unlock() helper
as mentioned in f9f33b85be and generally to make it
easier to talk about
2020-05-05 19:41:12 +02:00
Marko Mäkelä
496d0372ef Merge 10.4 into 10.5 2020-04-29 15:40:51 +03:00
Marko Mäkelä
0632b8034b Merge 10.3 into 10.4 2020-04-29 09:05:15 +03:00
Thirunarayanan Balathandayuthapani
c755974775 MDEV-19611 INPLACE ALTER does not fail on bad implicit default value
- Inplace ALTER shouldn't set a default date column to '0000-00-00' when
the table is not empty, so mysql_inplace_alter_table() copied
alter_ctx.error_if_not_empty to a new field of Alter_inplace_info.
ha_innobase::check_if_supported_inplace_alter() now checks the
error_if_not_empty flag and returns INPLACE_NOT_SUPPORTED if the table
is not empty
2020-04-28 11:59:32 +05:30
Monty
eca5c2c67f Added support for more functions when using partitioned S3 tables
MDEV-22088 S3 partitioning support

All ALTER PARTITION commands should now work on S3 tables except:

REBUILD PARTITION
TRUNCATE PARTITION
REORGANIZE PARTITION

In addition, partitioned S3 tables can also be replicated.
This is achieved by storing the partitioned table's .frm and .par files
on S3 for partitioned shared (S3) tables.

The discovery methods are enhanced by allowing engines that support
discovery to also handle the partitioned table's .frm and .par files.

Things in more detail

- The .frm and .par files of partitioned tables are stored in S3 and kept
  in sync.
- Added hton callback create_partitioning_metadata to inform the handler
  that metadata for a partitioned table has changed
- Added back handler::discover_check_version() to be able to check if
  a table's or a part table's definition has changed.
- Added handler::check_if_updates_are_ignored(). Needed for partitioning.
- Renamed rebind() -> rebind_psi(), as it was before.
- Changed CHF_xxx handler flags to an enum
- Changed some checks from using table->file->ht to use
  table->file->partition_ht() to get discovery to work with partitioning.
- If TABLE_SHARE::init_from_binary_frm_image() fails, ensure that we
  don't leave any .frm or .par files around.
- Fixed that writefrm() doesn't leave unusable .frm files around
- Appended the extension to the path in writefrm() to be able to reuse the
  function for creating .par files.
- Added DBUG_PUSH("") to a few functions that caused a lot of
  non-critical tracing.
2020-04-19 17:33:51 +03:00
Marko Mäkelä
af91266498 Merge 10.3 into 10.4
In main.index_merge_myisam we remove the test that was added in
commit a2d24def8c because
it duplicates the test case that was added in
commit 5af12e4635.
2020-04-16 12:12:26 +03:00
Marko Mäkelä
84db10f27b Merge 10.2 into 10.3 2020-04-15 09:56:03 +03:00
Sergei Golubchik
fcd84da5f1 MDEV-22218 InnoDB: Failing assertion: node->pcur->rel_pos == BTR_PCUR_ON upon LOAD DATA with NO_BACKSLASH_ESCAPES in SQL_MODE and unique blob in table
`inited == NONE` at initialization time does not always mean
that it'll be `NONE` later, at execution time. Use more complex
caller-specific logic to decide whether to create a cloned lookup handler.

Besides LOAD (as in the original bug report) make sure that all
prepare_for_insert() invocations are covered by tests. Add tests for
CREATE ... SELECT, multi-UPDATE, and multi-DELETE.

Don't enable write cache with long uniques.
2020-04-12 22:10:57 +02:00
Vlad Lesin
5836191c8f MDEV-21168: Active XA transactions stop slave from working after backup
was restored.

Optionally roll back prepared XAs on "mariabackup --prepare".

The fix MUST NOT be ported to 10.5+, as the MDEV-742 fix solves the issue
for slaves.
2020-04-07 15:05:38 +03:00
Nikita Malyavin
259fb1cbed MDEV-16978 Application-time periods: WITHOUT OVERLAPS
* The overlaps check is implemented on the handler level, per row command.
  It creates a separate cursor (actually, another handler instance) and
  caches it inside the original handler when ha_update_row or
  ha_write_row is issued. The cursor is closed when the handler is unlocked.

* Having the same key in the index means a unique constraint violation
  even in the usual sense. So we fetch the left and right neighbours and
  check that they have the same key prefix, excluding only the period part
  of the key. If it doesn't match, then there's no such neighbour, and the
  check passes. Otherwise, we check if this neighbour intersects with the
  considered key (see the sketch below).

* The check does not introduce a new error and fails with the ER_DUP_KEY
  error. This might break the REPLACE workflow and should be fixed
  separately
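
A simplified sketch of the neighbour check on plain integer periods (the
real code compares index keys; all names here are illustrative):

  struct row { int key; int start, end; };  // period is [start, end)

  // Two rows conflict under WITHOUT OVERLAPS iff they share the key
  // prefix and their periods intersect.
  static bool overlaps(const row &a, const row &b) {
    return a.key == b.key && a.start < b.end && b.start < a.end;
  }

  // Check a candidate against its left and right index neighbours; a
  // neighbour with a different key prefix can never conflict.
  static bool violates(const row &cand, const row *left, const row *right) {
    return (left && overlaps(cand, *left)) ||
           (right && overlaps(cand, *right));
  }
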
2020-03-31 17:42:34 +02:00
Sergei Golubchik
0515577d12 cleanup: prepare "update_handler" for WITHOUT OVERLAPS
* rename to a generic name
* move remaining initializations from query exec to prepare time
* simplify/unify key handling in open_table_from_share and delayed
* remove dead code
* move tests where they belong
2020-03-31 17:42:34 +02:00
Sergei Golubchik
27bf97aa00 cleanup: dead code, comments, avoid current_thd 2020-03-31 17:42:33 +02:00
Monty
eb483c5181 Updated optimizer costs in multi_range_read_info_const() and sql_select.cc
- multi_range_read_info_const now uses the new records_in_range interface
- Added handler::avg_io_cost()
- Don't calculate avg_io_cost() in get_sweep_read_cost if avg_io_cost is
  not 1.0.  In this case we trust the avg_io_cost() from the handler.
- Changed test_quick_select to use TIME_FOR_COMPARE instead of
  TIME_FOR_COMPARE_IDX to align this with the rest of the code.
- Fixed a bug in test_if_cheaper_ordering() where we didn't use
  keyread if the index was changed
- Fixed a bug where we didn't use index-only read when using order-by-index
- Added keyread_time() to HEAP.
  The default keyread_time() was optimized for blocks and not suitable for
  HEAP. The effect was that HEAP preferred table scans over ranges for
  btree indexes.
- Fixed get_sweep_read_cost() for HEAP tables
- Ensure that range and ref have the same cost for simple ranges
  Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
  we favor ref over range for simple queries.
- Fixed that matching_candidates_in_table() uses the same number of
  records as the rest of the optimizer
- Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
  HEAP and temporary tables better. A few tests changed because of this.
- heap::read_time() and heap::keyread_time() adjusted to not add +1.
  This was to ensure that handler::keyread_time() doesn't give
  higher cost for heap tables than for normal tables. One effect of
  this is that heap and derived tables stored in heap will prefer
  key access as this is now regarded as cheap.
- Changed cost for index read in sql_select.cc to match
  multi_range_read_info_const(). All index cost calculation is now
  done through one function.
- 'ref' will now use quick_cost for keys if it exists. This is done
  so that for '=' ranges, 'ref' is preferred over 'range'.
- scan_time() now takes avg_io_cost() into account
- get_delayed_table_estimates() uses block_size and avg_io_cost()
- Removed default argument to test_if_order_by_key(); simplifies code
2020-03-27 03:58:32 +02:00
Monty
f36ca142f7 Added page_range to records_in_range() to improve range statistics
Prototype change:
-  virtual ha_rows records_in_range(uint inx, key_range *min_key,
-                                   key_range *max_key)
+  virtual ha_rows records_in_range(uint inx, const key_range *min_key,
+                                   const key_range *max_key,
+                                   page_range *res)

The handler can ignore the page_range parameter. In case the handler
updates the parameter, the optimizer can deduce the following:
- whether the previous range's last key is on the same block as the next
  range's first key
- whether the current key range is within one block
- We can also assume that the first and last blocks read are cached!
  This can be used for a better calculation of IO seeks when we
  estimate the cost of a range index scan.

The parameter is fully implemented for MyISAM, Aria and InnoDB.
A separate patch will update handler::multi_range_read_info_const() to
take advantage of this change and also remove the double
records_in_range() calls that are no longer needed.
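
A hedged sketch of how an engine that tracks no page positions can
satisfy the new signature (struct layout and the "unknown" convention
are illustrative):

  typedef unsigned long long ha_rows_t;  // stand-in for ha_rows
  struct key_range;                      // opaque here
  struct page_range { ha_rows_t first_page, last_page; };

  struct handler_sketch {
    virtual ~handler_sketch() = default;
    virtual ha_rows_t records_in_range(unsigned inx,
                                       const key_range *min_key,
                                       const key_range *max_key,
                                       page_range *pages) {
      (void) inx; (void) min_key; (void) max_key;
      if (pages)
        *pages = page_range{0, ~0ULL};  // "don't know"; optimizer ignores it
      return 10;                        // the engine's usual row estimate
    }
  };
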
2020-03-27 03:54:45 +02:00
Monty
37393bea23 Replace handler::primary_key_is_clustered() with handler::pk_is_clustering_key()
This was done both to simplify the code and to make it easier to handle
storage engines that are clustered on some other index than the primary
key.

As pk_is_clustering_key() and is_clustering_key() now use only
index_flags(), these were removed from all storage engines.
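
An illustrative reduction of the check to a flag lookup (the flag name,
its value and the plumbing are stand-ins):

  static const unsigned long HA_CLUSTERED_INDEX = 1UL << 10; // illustrative

  struct table_sketch {
    unsigned primary_key;                       // index number of the PK
    unsigned long (*index_flags)(unsigned idx);
  };

  // "Is the primary key clustering?" becomes a plain flag test, so no
  // engine needs to override a method for it.
  static bool pk_is_clustering_key(const table_sketch &t) {
    return (t.index_flags(t.primary_key) & HA_CLUSTERED_INDEX) != 0;
  }
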
2020-03-24 21:00:04 +02:00
Monty
91ab42a823 Clean up and speed up interfaces for binary row logging
MDEV-21605 Clean up and speed up interfaces for binary row logging
MDEV-21617 Bug fix for previous version of this code

The intention is to have as few 'if's as possible in ha_write() and
related functions. This is done by pre-calculating the row_logging
state for all tables once per statement (see the sketch at the end of
this message).

Benefits are simpler and faster code both when binary logging is disabled
and when it's enabled.

Changes:
- Added handler->row_logging to make it easy to check whether a table
  should be row logged. This also made it easier to disable row logging
  for system, internal and temporary tables.
- The tables' row_logging capabilities are checked once per statement
  that updates tables in THD::binlog_prepare_for_row_logging(), which
  is called when needed from THD::decide_logging_format().
- Removed most usage of tmp_disable_binlog(), reenable_binlog() and
  temporary saving and setting of thd->variables.option_bits.
- Moved checks that can't change during a statement from
  check_table_binlog_row_based() to check_table_binlog_row_based_internal()
- Removed flag row_already_logged (used by sequence engine)
- Moved binlog_log_row() into class handler
- Moved write_locked_table_maps() to THD::binlog_write_table_maps() as
  most other related binlog functions are in THD.
- Removed binlog_write_table_map() and binlog_log_row_internal() as
  they are now obsolete: 'has_transactions()' is pre-calculated in
  prepare_for_row_logging().
- Removed the 'is_transactional' argument from binlog_write_table_map()
  as this can now be read from the handler.
- Changed order of 'if's in handler::external_lock() and wsrep_mysqld.h
  to first evaluate fast and likely cases before more complex ones.
- Added error checking in ha_write_row() and related functions if
  binlog_log_row() failed.
- Don't clear check_table_binlog_row_based_result in
  clear_cached_table_binlog_row_based_flag() as it's not needed.
- THD::clear_binlog_table_maps() has been replaced with
  THD::reset_binlog_for_next_statement()
- Added 'MYSQL_OPEN_IGNORE_LOGGING_FORMAT' flag to open_and_lock_tables()
  to avoid calculating of binary log format for internal opens. This flag
  is also used to avoid reading statistics tables for internal tables.
- Added OPTION_BINLOG_LOG_OFF as a simple way to turn off the binlog
  temporarily for CREATE (instead of using THD::sql_log_bin_off).
- Removed flag THD::sql_log_bin_off (not needed anymore)
- Speed up THD::decide_logging_format() by remembering whether the
  blackhole engine is used and avoiding a loop over all tables if it's
  not used (the common case).
- THD::decide_logging_format() is not called anymore if no tables are used
  for the statement. This will speed up pure stored procedure code by
  about 5%+ according to some simple tests.
- We now get annotated events on the slave if a CREATE ... SELECT
  statement is transformed on the slave from statement to row logging.
- In the original code, the master could come into a state where row
  logging was enforced for all future events even if statement logging
  could be used. This is now partly fixed.

Other changes:
- Ensure that all tables used by a statement have query_id set.
- Had to restore the row_logging flag for unused tables in
  THD::binlog_write_table_maps (not a normal scenario)
- Removed injector::transaction::use_table(server_id_type sid, table tbl)
  as it's not used.
- Cleaned up set_slave_thread_options()
- Some more DBUG_ENTER/DBUG_RETURN, code comments and minor indentation
  changes.
- Ensure we only call THD::decide_logging_format_low() once in
  mysql_insert() (fixes an inefficiency).
- Don't annotate INSERT DELAYED
- Removed zeroing pos_in_table_list in THD::open_temporary_table() as it's
  already 0
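
A sketch of the pattern described above: the per-statement decision is
cached in the handler and the row-write hot path only tests a boolean
(member and function names simplified):

  struct handler_sketch {
    bool row_logging = false;  // pre-calculated once per statement

    void binlog_log_row() { /* write the row event to the binlog cache */ }

    int ha_write_row() {
      // ... store the row in the engine ...
      if (row_logging)         // single cheap test on the hot path
        binlog_log_row();
      return 0;
    }
  };
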
2020-03-24 21:00:03 +02:00
Monty
4ef437558a Improve update handler (long unique keys on blobs)
MDEV-21606 Improve update handler (long unique keys on blobs)
MDEV-21470 MyISAM and Aria start_bulk_insert doesn't work with long unique
MDEV-21606 Bug fix for previous version of this code
MDEV-21819 2 Assertion `inited == NONE || update_handler != this'

- Move update_handler from TABLE to handler
- Move out initialization of update handler from ha_write_row() to
  prepare_for_insert()
- Fixed that INSERT DELAYED works with update handler
- Give an error if using long unique with an autoincrement column
- Added handler function to check if table has long unique hash indexes
- Disable the write cache in MyISAM and Aria when using update_handler,
  as if the cache were used, the row would not be inserted until the end
  of the statement and update_handler would not find conflicting rows.
- Removed not used handler argument from
  check_duplicate_long_entries_update()
- Syntax cleanups
  - Indentation fixes
  - Don't use single-character identifiers for arguments
2020-03-24 21:00:02 +02:00
Monty
736998cb75 Cleanups & indentation changes
- Only indentation changes in sql_rename.cc
- Ignore some WSREP error messages when there isn't an internet connection
- Force restart of stat_tables_part.test to make result stable
- Fixed compiler warnings in CONNECT
2020-03-24 21:00:02 +02:00
Monty
6a9e24d046 Added support for replication for S3
MDEV-19964 S3 replication support

Added new configure options:
s3_slave_ignore_updates
"If the slave shares the same S3 storage as the master"

s3_replicate_alter_as_create_select
"When converting S3 table to local table, log all rows in binary log"

This allows one to configure slaves to have their S3 storage shared with
or independent from the master.

Other thing:
Added a new session variable '@@sql_if_exists' to force IF EXISTS on DDLs.
2020-03-24 21:00:02 +02:00
Sergey Vojtovich
da82e75901 handler::rebind()
- rename PFS specific rebind_psi() to generic rebind()
- call rebind independently of PFS compilation status
- allow rebind() to return an error
2020-03-24 20:47:41 +02:00
Monty
305cffebab merge 10.4 to 10.5 2020-03-18 12:00:38 +02:00
Monty
1242eb3d32 Removed double records_in_range calls from multi_range_read_info_const
This was done to remove a performance regression between 10.3 and 10.4.
In 10.5 we will have a better implementation of records_in_range
that will enable us to get more statistics.
This change was not done in 10.4 because in 10.5 it will be part of
a larger change that is not suitable for the GA 10.4 version.

Other things:
- Changed the default handler block_size to 8192 to fix statistics
  for engines that don't set the block size.
- Fixed a bug in Spider when using multi-part const ranges
  (Patch from Kentoku)
2020-03-17 02:16:48 +02:00
Andrei Elkin
c8ae357341 MDEV-742 XA PREPAREd transaction survive disconnect/server restart
Lifted the long-standing limitation of rolling back an XA transaction at
the transaction's connection close even if the XA is prepared.

A prepared XA transaction is now made to survive connection close or
server restart.
The patch consists of

    - a binary logging extension to write the prepared part of an XA
      transaction, signified by its XID, in a new XA_prepare_log_event.
      The conclusion part - with the Commit or Rollback decision - is
      logged separately as a Query_log_event.
      That is, in the binlog the XA consists of two separate groups of
      events.

      This makes the whole XA possibly interleave in the binlog with
      other XAs or regular transactions, but with no harm to
      replication and data consistency.

      Gtid_log_event receives two more flags to identify which of the
      two XA phases of the transaction it represents. With either flag
      set, XID info is also added to the event.

      When binlog is ON on the server, XID::formatID is
      constrained to 4 bytes.

    - engines are made aware of the server policy to keep user-prepared
      XAs, so they (InnoDB, RocksDB) no longer roll them back
      at their disconnect methods.

    - the slave applier is refined to cope with two-phase-logged XAs,
      including parallel modes of execution.

This patch does not address crash-safe logging of the new events which
is being addressed by MDEV-21469.

CORNER CASES: read-only, pure myisam, binlog-*, @@skip_log_bin, etc

These are addressed by the following policies.
1. Read-only at reconnect marks the XID to fail any future
   completion with ER_XA_RBROLLBACK.

2. A binlog-* filtered XA, when it changes engine data, is regarded as
   loggable even when nothing got cached for the binlog.  An empty
   XA-prepare group is recorded. The consequent Commit-or-Rollback
   succeeds in the engine(s) and is also recorded into the binlog.

3. The same applies to the non-transactional engine XA.

4. @@skip_log_bin=OFF does not record anything at XA-prepare
   (obviously), but the completion event is recorded into the binlog to
   admit inconsistency with the slave.

The following actions are taken by the patch.

At XA-prepare:
   when empty binlog cache - don't do anything to binlog if RO,
   otherwise write empty XA_prepare (assert(binlog-filter case)).

At Disconnect:
   when Prepared && RO (=> no binlogging was done)
     set Xid_cache_element::error := ER_XA_RBROLLBACK
     *keep* XID in the cache, and rollback the transaction.

At XA-"complete":
   Discover the error, if any don't binlog the "complete",
   return the error to the user.

Kudos
-----
Alexey Botchkov took to drive this work initially.
Sergei Golubchik, Sergei Petrunja, Marko Mäkelä provided a number of
good recommendations.
Sergei Voitovich made a magnificent review and improvements to the code.
They all deserve a bunch of thanks for getting this work done!
2020-03-14 22:45:48 +02:00
Sergei Golubchik
91d1588d30 Merge branch 'github/10.5' into 10.5 2020-03-14 09:52:35 +01:00
Marko Mäkelä
d82ac8d374 MDEV-21907: Fix some -Wconversion outside InnoDB
Some .c and .cc files are compiled as part of Mariabackup.
Enabling -Wconversion for InnoDB would also enable it for
Mariabackup. The .h files are being included during
InnoDB or Mariabackup compilation.

Notably, GCC 5 (but not GCC 4 or 6 or later versions)
would report -Wconversion for x|=y when the type is
unsigned char. So, we will either write x=(uchar)(x|y)
or disable the -Wconversion warning for GCC 5.

bitmap_set_bit(), bitmap_flip_bit(), bitmap_clear_bit(), bitmap_is_set():
Always implement as inline functions.
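
The pattern in miniature: the operands of | are promoted to int and the
assignment narrows back to unsigned char, which GCC 5's -Wconversion
flags:

  typedef unsigned char uchar;

  static uchar set_bits(uchar x, uchar y) {
    // x|= y;             // -Wconversion warning on GCC 5
    x= (uchar) (x | y);   // explicit narrowing keeps GCC 5 quiet
    return x;
  }
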
2020-03-12 19:44:52 +02:00
Oleksandr Byelkin
fad47df995 Merge branch '10.4' into 10.5 2020-03-11 17:52:49 +01:00
Sergei Golubchik
cbede21d0d cleanup: pass trxid by value 2020-03-10 19:24:23 +01:00
Sergei Golubchik
0d837e8153 perfschema table io instrumentation related changes 2020-03-10 19:24:23 +01:00
Eugene Kosov
7ccc1710a0 cleanup: key parts comparison
Engine-specific code moved to the engine.
2020-02-18 22:53:28 +03:00
Marko Mäkelä
28c89b7151 Merge 10.4 into 10.5 2019-12-16 07:47:17 +02:00
Oleksandr Byelkin
a15234bf4b Merge branch '10.3' into 10.4 2019-12-09 15:09:41 +01:00
Faustin Lammler
2df2238cb8 Lintian complains about a spelling error
The Lintian check complains about a spelling error:
https://salsa.debian.org/mariadb-team/mariadb-10.3/-/jobs/95739
2019-12-02 12:41:13 +02:00
Aleksey Midenkov
8ed646f071 Merge 10.4 into 10.5 2019-12-02 13:35:54 +03:00