The issue was that when LIMIT is used,
SQL_SELECT::test_quick_select would set the cost of the table scan to be
unreasonably high to force a range to be used.
The problem with this approach was that a range was used even when the
cost of the range, reading only 'limit' rows, was higher
than the cost of a table scan.
This patch fixes it by not accepting ranges when the range can never
have a lower cost than a table scan, even if every row would match the
WHERE clause.
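The acceptance rule can be illustrated with a minimal sketch. This is not the server's cost model: the struct RangeEstimate, both function names, and the linear per-row cost are assumptions made only for this example. The best case for the range is that every row matches, so the scan stops after 'limit' rows; if even that cost is not below the table scan cost, the range is rejected.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified cost inputs; the real optimizer uses a richer model.
struct RangeEstimate {
  double   startup_cost;  // cost of positioning in the index
  double   cost_per_row;  // cost of fetching one row through the range
  uint64_t rows;          // rows the range is expected to read without LIMIT
};

// Best-case range cost: every row matches WHERE, so LIMIT stops the
// scan after min(rows, limit) rows.
double best_case_range_cost(const RangeEstimate &r, uint64_t limit)
{
  assert(r.startup_cost >= 0.0 && r.cost_per_row >= 0.0);
  uint64_t rows_read= std::min(r.rows, limit);
  return r.startup_cost + r.cost_per_row * static_cast<double>(rows_read);
}

// Accept the range only if its best case is cheaper than the table scan.
bool range_can_beat_table_scan(const RangeEstimate &r, uint64_t limit,
                               double table_scan_cost)
{
  return best_case_range_cost(r, limit) < table_scan_cost;
}
```

With a table scan cost of 50 and LIMIT 10, a range costing 1 + 0.5 per row is accepted, while one costing 1 + 20 per row is rejected.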
Part#2: cleanup:
In part 1 of the fix, the DS-MRR implementation would peek into
the JOIN_TAB to get the rowid filter from
table->reginfo.join_tab->rowid_filter
This doesn't look good from a code isolation standpoint (why should a
storage engine assume it is used through a JOIN_TAB?).
Fixed this by storing the 'un-pushed' rowid_filter in the DsMrr_impl
structure. The filter survives across multi_range_read_init() calls.
It is discarded when somebody calls index_end() or rnd_end() and cleans
up the DsMrr_impl.
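The lifetime of the stored filter can be sketched as below. This is a heavily reduced model, not the real DsMrr_impl: the class DsMrrSketch, the type RowidFilter, and the method names set_unpushed_filter() and cleanup() are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Stand-in for the server's rowid filter type; only its identity matters here.
struct RowidFilter { int id; };

// Hypothetical, reduced model of the DS-MRR object.
class DsMrrSketch {
public:
  // Called when the filter is taken back from the engine: keep it here
  // instead of looking it up through a JOIN_TAB later.
  void set_unpushed_filter(RowidFilter *f)
  {
    assert(f != nullptr);
    rowid_filter_= f;
  }

  // Re-initializing an MRR scan leaves the stored filter untouched.
  void multi_range_read_init() { ++init_calls_; }

  // Reached when index_end() or rnd_end() cleans up: forget the filter.
  void cleanup() { rowid_filter_= nullptr; }

  RowidFilter *filter() const { return rowid_filter_; }
  int init_calls() const { return init_calls_; }

private:
  RowidFilter *rowid_filter_= nullptr;
  int init_calls_= 0;
};
```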
This patch fixes the following defects/bugs:
1. If the BKA[H] algorithm was used to join a table for which the optimizer
had decided to employ a rowid filter, the filter was not actually built.
2. The patch for bug MDEV-21356, which added code cancelling the pushing of a
rowid filter into an engine for a table joined using
BKA[H] and MRR, was not quite correct for the InnoDB engine because the
cancellation was done after InnoDB code had already bound the pushed
filter to internal InnoDB structures.
Fix partitioning and DS-MRR to work together
- In ha_partition::index_end(): take into account that ha_innobase (and
other engines using DS-MRR) will have inited=RND when initialized for
DS-MRR scan.
- In ha_partition::multi_range_read_next(): if the MRR scan is using
HA_MRR_NO_ASSOCIATION mode, it is not guaranteed that the partition's
handler will store anything into *range_info.
- In DsMrr_impl::choose_mrr_impl(): ha_partition will inquire partitions
about how much memory their MRR implementation needs by passing
*buffer_size=0. DS-MRR code didn't know about this (it actually used
uint for the buffer size calculation and would underflow).
Returning *buffer_size=0 made ha_partition assume that partitions do
not need MRR memory and pass the same buffer to each of them.
Now, this is fixed. If DS-MRR gets *buffer_size=0, it will return
the amount of buffer space needed, but not more than about
@@mrr_buffer_size.
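The *buffer_size=0 convention described above can be sketched as follows. The function choose_buffer_size() and its parameters are hypothetical, and the cap is applied exactly here, whereas the real code only promises "about" @@mrr_buffer_size. Using std::min on unsigned values avoids the subtraction that could underflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not the real choose_mrr_impl() logic.
//   offered: value passed in *buffer_size; 0 means "tell me what you need"
//   needed:  bytes the scan would like to have
//   limit:   stands in for @@mrr_buffer_size
std::size_t choose_buffer_size(std::size_t offered, std::size_t needed,
                               std::size_t limit)
{
  assert(limit > 0);
  // Never ask for more than the configured limit.
  std::size_t wanted= std::min(needed, limit);
  if (offered == 0)
    return wanted;                  // inquiry: report the requirement
  return std::min(offered, wanted); // real buffer: use what fits
}
```

A caller such as ha_partition could then sum the reported sizes over its partitions instead of handing the same buffer to each of them.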
- Fix ha_{innobase,maria,myisam}::clone. If ha_partition uses MRR on its
partitions, and the partitions use DS-MRR, the code will call handler->clone
with the TABLE (*NOT partition*) name as an argument.
DS-MRR has no way of knowing the partition name, so the solution was
to have the ::clone() function of each affected storage engine ignore
the name argument and get it elsewhere.
This commit is based on the work of Michal Schorm, rebased on the
earliest MariaDB version.
The command line used to generate this diff was:
find ./ -type f \
-exec sed -i -e 's/Foundation, Inc., 59 Temple Place, Suite 330, Boston, /Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, /g' {} \; \
-exec sed -i -e 's/Foundation, Inc. 59 Temple Place.* Suite 330, Boston, /Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, /g' {} \; \
-exec sed -i -e 's/MA.*.....-1307.*USA/MA 02110-1335 USA/g' {} \; \
-exec sed -i -e 's/Foundation, Inc., 59 Temple/Foundation, Inc., 51 Franklin/g' {} \; \
-exec sed -i -e 's/Place, Suite 330, Boston, MA.*02111-1307.*USA/Street, Fifth Floor, Boston, MA 02110-1335 USA/g' {} \; \
-exec sed -i -e 's/MA.*.....-1307/MA 02110-1335/g' {} \;
The XtraDB storage engine was already replaced by InnoDB
and disabled in MariaDB Server 10.2. Let us remove it altogether
to avoid dragging dead code around.
Replace some references to XtraDB with references to InnoDB.
rpl_get_position_info(): Remove.
Remove the mysql-test-run --suite=percona, because it only contains
tests specific to XtraDB, many of which were disabled already in
earlier MariaDB versions.
(Backport to 5.3)
(variant #2, with fixed coding style)
- Make Mrr_ordered_index_reader::resume_read() restore index position
only if it was saved before with Mrr_ordered_index_reader::interrupt_read().
- The crash occurred because the optimizer called handler->multi_range_read_info()
on a derived temporary table. That table had been created, but not opened yet.
Because of that, handler::table was NULL, which caused the crash.
Fixed by changing DS-MRR methods to use handler::table_share instead.
handler::table_share is set in the handler ctor, so this should be safe.
- "Using MRR" is no longer shown with range access.
- Instead, both range and BKA accesses will show one of the following:
= "Rowid-ordered scan"
= "Key-ordered scan"
= "Key-ordered Rowid-ordered scan"
depending on whether the DS-MRR implementation will scan keys in order, rowids in order,
or both.
- The patch also introduces a way for other storage engines/MRR implementations to
pass information to EXPLAIN output about the properties of employed MRR scans.
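How the three EXPLAIN strings follow from the two ordering properties can be sketched as below. The flag names and the function mrr_explain_text() are hypothetical and are not the server's interface for passing this information to EXPLAIN.

```cpp
#include <cassert>
#include <string>

// Hypothetical flag values, not the server's.
enum MrrSketchFlags : unsigned {
  SKETCH_KEY_ORDERED=   1u,
  SKETCH_ROWID_ORDERED= 2u
};

// Build the EXPLAIN text from the ordering properties of the scan.
std::string mrr_explain_text(unsigned flags)
{
  // Reject bits this sketch does not know about.
  assert((flags & ~(SKETCH_KEY_ORDERED | SKETCH_ROWID_ORDERED)) == 0);
  std::string text;
  if (flags & SKETCH_KEY_ORDERED)
    text+= "Key-ordered ";
  if (flags & SKETCH_ROWID_ORDERED)
    text+= "Rowid-ordered ";
  if (!text.empty())
    text+= "scan";
  return text;  // empty when neither ordering is used
}
```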
- The problem was that Mrr_ordered_index_reader's interrupt_read() and resume_read() would
save and restore only 1) the index tuple and 2) the rowid (as bytes returned by handler->position()). Clustered
primary key columns were not saved/restored.
They are not explicitly present in the index tuple (i.e. table->key_info[secondary_key].key_parts
doesn't list them), but they are actually there; in particular,
table->field[clustered_primary_key_member].part_of_key(secondary_key) == 1. Index condition pushdown
code [correctly] uses the latter as an indication that the pushed index condition can refer to clustered PK
members.
The fix was to make interrupt_read()/resume_read() save/restore clustered primary key members as well,
so that we get correct values for them when evaluating the pushed index condition.
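The save/restore behaviour of interrupt_read()/resume_read() can be sketched as follows. This is a simplified model, not the real Mrr_ordered_index_reader: ScanPosition and OrderedReaderSketch are hypothetical, strings stand in for record buffers, and clearing the flag on restore is an assumption of this sketch. It also shows the guard that restores only a position that was actually saved.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the state a secondary-index scan exposes.
struct ScanPosition {
  std::string index_tuple;  // key columns listed in the secondary index
  std::string rowid;        // bytes as returned by handler->position()
  std::string pk_columns;   // clustered primary key members
};

class OrderedReaderSketch {
public:
  explicit OrderedReaderSketch(ScanPosition *current) : current_(current)
  {
    assert(current != nullptr);
  }

  // Save everything a pushed index condition may refer to,
  // including the clustered primary key members.
  void interrupt_read()
  {
    saved_= *current_;
    have_saved_= true;
  }

  // Restore only if interrupt_read() saved a position before.
  void resume_read()
  {
    if (!have_saved_)
      return;
    *current_= saved_;
    have_saved_= false;
  }

private:
  ScanPosition *current_;
  ScanPosition saved_;
  bool have_saved_= false;
};
```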
[3rd attempt: remove the debugging aids, fix comments in testcase]
Switch from the "Disable identical key handling optimization when
IndexConditionPushdown is used" approach
to an approach where we save/restore the index tuple and so can use
index condition pushdown.
- Make Mrr_ordered_index_reader() save the rowid across scan interruptions
Also
- Fix compiler warning for setup_buffer_sizes()
- Add commented key_copy/key_restore for better handling of a similar issue
with index record being destroyed by scan interruption (which causes
incorrect evaluation of pushed index condition later on).