* rename to a generic name
* move remaning initializations from query exec to prepare time
* simplify/unify key handling in open_table_from_share and delayed
* remove dead code
* move tests where they belong
Sergei's commit ac6b3c4430 implemented handler status counters
compensation for underlying handlers like ha_partition.
`index_read_idx_map` is missing there, but it should have been fixed as
well (proof: ha_partition::index_read_idx_map never calls
ha_partition::index_read_map).
Note: all this compensation logic could be broken for subpartitions! (We
can experience double decrement)
partition does rebuild
- In ha_innobase::commit_inplace_alter_table() assumes that all partition
should do the same kind of alter operations. During DDL, if one partition
requires table rebuild and other partition doesn't need rebuild
then all partition should be forced to rebuild.
Prototype change:
- virtual ha_rows records_in_range(uint inx, key_range *min_key,
- key_range *max_key)
+ virtual ha_rows records_in_range(uint inx, const key_range *min_key,
+ const key_range *max_key,
+ page_range *res)
The handler can ignore the page_range parameter. In the case the handler
updates the parameter, the optimizer can deduce the following:
- If previous range's last key is on the same block as next range's first
key
- If the current key range is in one block
- We can also assume that the first and last block read are cached!
This can be used for a better calculation of IO seeks when we
estimate the cost of a range index scan.
The parameter is fully implemented for MyISAM, Aria and InnoDB.
A separate patch will update handler::multi_range_read_info_const() to
take the benefits of this change and also remove the double
records_in_range() calls that are not anymore needed.
This was done to both simplify the code and also to be easier to handle
storage engines that are clustered on some other index than the primary
key.
As pk_is_clustering_key() and is_clustering_key now are using only
index_flags, these where removed from all storage engines.
MDEV-21605 Clean up and speed up interfaces for binary row logging
MDEV-21617 Bug fix for previous version of this code
The intention is to have as few 'if' as possible in ha_write() and
related functions. This is done by pre-calculating once per statement the
row_logging state for all tables.
Benefits are simpler and faster code both when binary logging is disabled
and when it's enabled.
Changes:
- Added handler->row_logging to make it easy to check it table should be
row logged. This also made it easier to disabling row logging for system,
internal and temporary tables.
- The tables row_logging capabilities are checked once per "statements
that updates tables" in THD::binlog_prepare_for_row_logging() which
is called when needed from THD::decide_logging_format().
- Removed most usage of tmp_disable_binlog(), reenable_binlog() and
temporary saving and setting of thd->variables.option_bits.
- Moved checks that can't change during a statement from
check_table_binlog_row_based() to check_table_binlog_row_based_internal()
- Removed flag row_already_logged (used by sequence engine)
- Moved binlog_log_row() to a handler::
- Moved write_locked_table_maps() to THD::binlog_write_table_maps() as
most other related binlog functions are in THD.
- Removed binlog_write_table_map() and binlog_log_row_internal() as
they are now obsolete as 'has_transactions()' is pre-calculated in
prepare_for_row_logging().
- Remove 'is_transactional' argument from binlog_write_table_map() as this
can now be read from handler.
- Changed order of 'if's in handler::external_lock() and wsrep_mysqld.h
to first evaluate fast and likely cases before more complex ones.
- Added error checking in ha_write_row() and related functions if
binlog_log_row() failed.
- Don't clear check_table_binlog_row_based_result in
clear_cached_table_binlog_row_based_flag() as it's not needed.
- THD::clear_binlog_table_maps() has been replaced with
THD::reset_binlog_for_next_statement()
- Added 'MYSQL_OPEN_IGNORE_LOGGING_FORMAT' flag to open_and_lock_tables()
to avoid calculating of binary log format for internal opens. This flag
is also used to avoid reading statistics tables for internal tables.
- Added OPTION_BINLOG_LOG_OFF as a simple way to turn of binlog temporary
for create (instead of using THD::sql_log_bin_off.
- Removed flag THD::sql_log_bin_off (not needed anymore)
- Speed up THD::decide_logging_format() by remembering if blackhole engine
is used and avoid a loop over all tables if it's not used
(the common case).
- THD::decide_logging_format() is not called anymore if no tables are used
for the statement. This will speed up pure stored procedure code with
about 5%+ according to some simple tests.
- We now get annotated events on slave if a CREATE ... SELECT statement
is transformed on the slave from statement to row logging.
- In the original code, the master could come into a state where row
logging is enforced for all future events if statement could be used.
This is now partly fixed.
Other changes:
- Ensure that all tables used by a statement has query_id set.
- Had to restore the row_logging flag for not used tables in
THD::binlog_write_table_maps (not normal scenario)
- Removed injector::transaction::use_table(server_id_type sid, table tbl)
as it's not used.
- Cleaned up set_slave_thread_options()
- Some more DBUG_ENTER/DBUG_RETURN, code comments and minor indentation
changes.
- Ensure we only call THD::decide_logging_format_low() once in
mysql_insert() (inefficiency).
- Don't annotate INSERT DELAYED
- Removed zeroing pos_in_table_list in THD::open_temporary_table() as it's
already 0
MDEV-21606 Improve update handler (long unique keys on blobs)
MDEV-21470 MyISAM and Aria start_bulk_insert doesn't work with long unique
MDEV-21606 Bug fix for previous version of this code
MDEV-21819 2 Assertion `inited == NONE || update_handler != this'
- Move update_handler from TABLE to handler
- Move out initialization of update handler from ha_write_row() to
prepare_for_insert()
- Fixed that INSERT DELAYED works with update handler
- Give an error if using long unique with an autoincrement column
- Added handler function to check if table has long unique hash indexes
- Disable write cache in MyISAM and Aria when using update_handler as
if cache is used, the row will not be inserted until end of statement
and update_handler would not find conflicting rows.
- Removed not used handler argument from
check_duplicate_long_entries_update()
- Syntax cleanups
- Indentation fixes
- Don't use single character indentifiers for arguments
- Flag ALTER_STORED_COLUMN_TYPE set while doing varchar extension
for partition table. Basically all partition supports
can_be_converted_by_engine() then it should be set to
ALTER_COLUMN_TYPE_CHANGE_BY_ENGINE.
join_cache_level=6+
The patch fixes two similar bugs in the commit 8eeb689e9f
that added multi_range_read support to partitions. The commit opened
a possibility to join a partition table using BKA+MRR. However in some
cases it could lead to wrong results or even crashes.
This could happened when
- index condition pushdown was used to join the table or
- the joined table was an inner table of an outer join and 'not exist'
optimization was applied or
- the join table was the inner table of a semi-join and the first match
optimization was applied
The bugs were in the code of the call-back functions
- partition_multi_range_key_skip_record() and
- partition_multi_range_key_skip_index_tuple().
Each of this function consist only of an invocation of another function.
Yet a wrong parameter was passed at this invocation.
The fix was suggested by Sergey Petrunia and it is apparently in line
with original design.
The corresponding comprehensive test cases demonstrating the problems
caused by the bugs were constructed by me.
Incorrect assertion of EXTRA_CACHE for
HA_EXTRA_PREPARE_FOR_UPDATE. The latter is related to read cache, but
must operate without it as a noop.
Related to Bug#55458 and MDEV-20441.
LIMIT history partitions cannot be checked by existing algorithm of
check_misplaced_rows() because working history partition is
incremented each time another one is filled. The existing algorithm
gets record and tries to decide partition id for it by
get_partition_id(). For LIMIT history it will just get first
non-filled partition.
To fix such partitions it is required to do REBUILD instead of REPAIR.
MDEV-18957 UPDATE with LIMIT clause is wrong for versioned partitioned tables
UPDATE, DELETE: replace linear search of current/historical records
with vers_setup_conds().
Additional DML cases in view.test
Fix partitioning and DS-MRR to work together
- In ha_partition::index_end(): take into account that ha_innobase (and
other engines using DS-MRR) will have inited=RND when initialized for
DS-MRR scan.
- In ha_partition::multi_range_read_next(): if the MRR scan is using
HA_MRR_NO_ASSOCIATION mode, it is not guaranteed that the partition's
handler will store anything into *range_info.
- In DsMrr_impl::choose_mrr_impl(): ha_partition will inquire partitions
about how much memory their MRR implementation needs by passing
*buffer_size=0. DS-MRR code didn't know about this (actually it used
uint for buffer size calculation and would have an under-flow).
Returning *buffer_size=0 made ha_partition assume that partitions do
not need MRR memory and pass the same buffer to each of them.
Now, this is fixed. If DS-MRR gets *buffer_size=0, it will return
the amount of buffer space needed, but not more than about
@@mrr_buffer_size.
* Fix ha_{innobase,maria,myisam}::clone. If ha_partition uses MRR on its
partitions, and partition use DS-MRR, the code will call handler->clone
with TABLE (*NOT partition*) name as an argument.
DS-MRR has no way of knowing the partition name, so the solution was
to have the ::clone() function for the affected storage engine to ignore
the name argument and get it elsewhere.
Partition table with the AUTO_INCREMENT column we ahve to check if the
max value is properly loaded. So we need to open all tables in INSERT
PARTITION statement if necessary. Also we need to check if some
tables are pruned away and not count the max autoincrement in this case.
- Note that some issues was also fixed in 10.2 and 10.4. I also fixed them
here to be able to continue with making 10.5 valgrind safe again
- Disable connection threads warnings when doing shutdown
The MDEV-20265 commit e746f451d5
introduces DBUG_ASSERT(right_op == r_tbl) in
st_select_lex::add_cross_joined_table(), and that assertion would
fail in several tests that exercise joins. That commit was skipped
in this merge, and a separate fix of MDEV-20265 will be necessary in 10.4.
Exclude SELECT and INSERT SELECT from vers_set_hist_part(). We cannot
likewise exclude REPLACE SELECT because it may REPLACE into itself
(and REPLACE generates history).
INSERT also does not generate history, but we have history
modification setting which might be interfered.