This patch also fixes
MDEV-31391 Assertion `((best.records_out) == 0.0 ... failed
Cost changes caused by this change:
- range queries with join buffer now have a notable smaller cost.
- range ranges are bit more expensive as the MULTI_RANGE_COST is now
properly applied to it in all cases (this extra cost is equal to a
key lookup).
- table scan cost is slight smaller as we now assume data is cached in
the engine after the first scan pass. (We did this before for range
scans and other access methods).
- partition tables had wrong values for max_row_blocks and
max_index_blocks. Correcting this, causes range access on
partitioned tables to have slightly higher cost because of the
increased estimated IO.
- Using first match + join buffer caused 'filtered' to be calcualted
wrong. (Only affected EXPLAIN, not query costs).
- Added cost_without_join_buffer to optimizer_trace.
- check_quick_select() adjusted the number of rows according to persistent
statistics, but did not adjust cost. Now fixed.
The big change in the patch are:
- In best_access_path(), where we now are using storing the cost in
'ALL_READ_COST cost' and only converting it to a double at the end.
This allows us to more exactly calculate the effect of the join_cache.
- In JOIN_TAB::estimate_scan_time(), store the cost also in a
ALL_READ_COST object.
One of effect if this change is that when joining very small tables:
t1 some_access_method
t2 range
t3 ALL Use join buffer
This is swiched to
t1 some_access_method
t3 ALL
t2 range use join buffer
Both plans has the same cost, but as table scan in this case has less
cost than rang, the table scan will be considered first and thus have
precidence.
Test case changes:
- optimizer_trace - Addition of cost_without_join_buffer
- subselect_mat_cost_bugs - Small tables and scan versus range
- range & range_mrr_icp - Range + join_cache is faster than ref
- optimizer_trace - cost_without_join_buffer, smaller scan cost,
range setup cost.
- mrr - range+join_buffer used as smaller cost
This is 11.0 part of the fix: in 11.0, get_costs_for_tables() calls
best_access_path() for all possible tables, for each call it saves
a POSITION object with the access method and "loose_scan_pos"
POSITION object.
The latter is saved even if there is no possible LooseScan plan. Saving
is done by copying POSITION objects which may generate a spurious
UBSan error.
The problem was a wrong assert. I changed it to match the code in
best_access_path().
The given test case was a bit tricky for the optimizer, which first
decided on using a index scan (because of force index), but then
test_if_skip_sort_order() decided to use range anyway to handle
distinct.
This was caused of two minor issues:
- get_quick_record_count() returned the number of rows for range with
least cost, when it should have returned the minum number of rows
for any range.
- When changing REF to RANGE, we also changed records_out, which
should not be done (number of records in the result will not
change).
The above change can cause a small change in row estimates where the
optimizer chooses a clustered key with more rows than a range one
secondary key (unlikely case).
The code in create_internal_tmp_table() didn't take into account that
now temporary (derived) tables may have multiple indexes:
- one index due to duplicate removal
= In this example created by conversion of big-IN(...) into subquery
= this index might be converted into a "unique constraint" if the key
length is too large.
- one index added by derived_with_keys optimization.
Make create_internal_tmp_table() handle multiple indexes.
Before this patch, use of a unique constraint was indicated in
TABLE_SHARE::uniques. This was ok as unique constraint was the only index
in the table. Now it's no longer the case so TABLE_SHARE::uniques is removed
and replaced with an in-memory-only flag HA_UNIQUE_HASH.
This patch is based on Monty's patch.
Co-Author: Monty <monty@mariadb.org>
This makes it easier to run test on just the MRG_MYISAM engine.
Other things:
- Renamed merge-big.test to flush_corruption.test as it does not
have anything to do with merge tables.
The problem was that for merge tables without any underlaying tables the
block size was 0, which the code could not handle.
Fixed by ensuring that MERGE tables never have a block size of 0.
Fixed also a wrong assumption of number of seeks needed to scan
merge tables.
Introduces @@optimizer_switch flag: hash_join_cardinality
When this option is on, use EITS statistics to produce tighter bounds
for hash join output cardinality.
This patch is an extension / replacement to a similar patch in 10.6
New features:
- optimizer_switch hash_join_cardinality is on by default
- records_out is set to fanout when HASH is used
- Fixed bug in is_eits_usable: The function did not work with views
rtr_ins_enlarge_mbr(): Relax the assertion that was added in
commit f27e9c8947
to cover also table-rebuilding DDL. We only need
btr_cur->rtr_info->thr for acquiring R-tree locks,
and the thread that is executing a DDL operation has exclusive
access to the being-built index or the copy of the being-rebuilt table.
Set the lock wait timeout to 1 beforehand, and reset it afterwards, to
avoid lock conflict caused by opening the same table twice in case of
self-reference.
Extracted out common subroutines, gave more meaningful names etc,
added comments etc.
Also:
- Documented active servers load balancing reads, and other fields in
SPIDER_SHARE etc.
- Removed commented out code
- Documented and refactored self-reference check
- Removed some unnecessary functions
- Renamed unhelpful roop_count
- Refactored spider_get_{sts,crd}, where we turn get_type into an enum
- Cleaned up spider_mbase_handler::show_table_status() and
spider_mbase_handler::show_index()
Deb-autobakes where failing with:
dh_missing: warning: usr/share/mysql/mysql-test/mysql-test-run exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/share/mysql/mysql-test/mtr exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/share/mysql/mysql-test/mariadb-test-run exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/share/mysql/mysql-test/mysql-test-run.pl exists in debian/tmp but is not installed to anywhere
Add all to mariadb-test.install and remove mariadb-test.links
AWK in used in Debian SysV-init and postinst scripts to determine
is there enough space starting MariaDB database or create new
database to target destination.
These AWK scripts can be rewrited to use pure SH or help
using Coreutils which is mandatory for usage of MariaDB currently.
Reasoning behind this is to get rid of one very less used dependency
make the compile-time logic in my_timer_cycles() also #define
MY_TIMER_ROUTINE_CYCLES to indicate which implementation it is using.
Then, make my_timer_init() use MY_TIMER_ROUTINE_CYCLES.
This leaves us with just one set of compile-time #if's which determine
how we read time in #cycles.
Reviewer (and commit message author): Sergei Petrunia <sergey@mariadb.com>
log_free_check(): Assert that the caller must not hold
exclusive lock_sys.latch. This was the case for calls from
ibuf_delete_for_discarded_space(). This caused a deadlock with
another thread that would be holding a latch on a dirty page
that would need to be written so that the checkpoint would advance
and log_free_check() could return. That other thread was waiting
for a shared lock_sys.latch.
fil_delete_tablespace(): Do not invoke ibuf_delete_for_discarded_space()
because in DDL operations, we will be holding exclusive lock_sys.latch.
trx_t::commit(std::vector<pfs_os_file_t>&), innodb_drop_database(),
row_purge_remove_clust_if_poss_low(), row_undo_ins_remove_clust_rec(),
row_discard_tablespace_for_mysql():
Invoke ibuf_delete_for_discarded_space() on the deleted tablespaces after
releasing all latches.
page_cleaner_flush_pages_recommendation(): If dirty_pct is
between innodb_max_dirty_pages_pct_lwm
and innodb_max_dirty_pages_pct,
scale the effort relative to how close we are to
innodb_max_dirty_pages_pct.
The previous formula was missing a multiplication by 100.
Tested by: Axel Schwenke
This addition to MDEV-30804 is relevant for 10.6+, it excludes
the mixed transaction section using both innodb and aria storage
engines from the galera_var_replicate_aria_off test, since such
transactions cannot be executed unless aria supports two-phase
transaction commit. No additional tests are required as this
commit fixes the mtr test itself.
The error was seen by a number of mtr tests being caused
by overdue initialization of rpl_parallel::LOCK_parallel_entry.
Specifically, SHOW-SLAVE-STATUS might find in
rpl_parallel::workers_idle() a gtid domain hash entry
already inserted whose mutex had not done
mysql_mutex_init().
Fixed with swapping the mutex init and the its entry's stack insertion.
Tested with a generous number of `mtr --repeat` of a few of the reported
to fail tests, incl rpl.parallel_backup.
buf_pool_t::page_cleaner_wakeup(): If for_LRU=true, wake up the page
cleaner immediately, also when it is in a timed wait. This avoids an
unnecessary delay of up to 1 second.
When commit a5a2ef079c
implemented asynchronous doublewrite, the writes via
the doublewrite buffer started to be counted incorrectly,
without multiplying them by innodb_page_size.
srv_export_innodb_status(): Correctly count the
Innodb_data_written.
buf_dblwr_t: Remove submitted(), because it is close to written()
and only Innodb_data_written was interested in it. According to
its name, it should count completed and not submitted writes.
Tested by: Axel Schwenke