- multi_range_read_info_const now uses the new records_in_range interface
- Added handler::avg_io_cost()
- Don't calculate avg_io_cost() in get_sweep_read_cost if avg_io_cost is
not 1.0. In this case we trust the avg_io_cost() from the handler.
- Changed test_quick_select to use TIME_FOR_COMPARE instead of
TIME_FOR_COMPARE_IDX to align this with the rest of the code.
- Fixed bug when using test_if_cheaper_ordering where we didn't use
keyread if index was changed
- Fixed a bug where we didn't use index only read when using order-by-index
- Added keyread_time() to HEAP.
The default keyread_time() was optimized for blocks and not suitable for
HEAP. The effect was the HEAP prefered table scans over ranges for btree
indexes.
- Fixed get_sweep_read_cost() for HEAP tables
- Ensure that range and ref have same cost for simple ranges
Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
we favior ref for range for simple queries.
- Fixed that matching_candidates_in_table() uses same number of records
as the rest of the optimizer
- Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
HEAP and temporary tables better. A few tests changed because of this.
- heap::read_time() and heap::keyread_time() adjusted to not add +1.
This was to ensure that handler::keyread_time() doesn't give
higher cost for heap tables than for normal tables. One effect of
this is that heap and derived tables stored in heap will prefer
key access as this is now regarded as cheap.
- Changed cost for index read in sql_select.cc to match
multi_range_read_info_const(). All index cost calculation is now
done trough one function.
- 'ref' will now use quick_cost for keys if it exists. This is done
so that for '=' ranges, 'ref' is prefered over 'range'.
- scan_time() now takes avg_io_costs() into account
- get_delayed_table_estimates() uses block_size and avg_io_cost()
- Removed default argument to test_if_order_by_key(); simplifies code
Remove usage of deprecated variable storage_engine. It was deprecated in 5.5 but
it never issued a deprecation warning. Make it issue a warning in 10.5.1.
Replaced with default_storage_engine.
row_drop_table_for_mysql(): If a #sql2 table is open in another
thread (purge) while we attempting to drop it, rename it to #sql-ib
name to hide it from the SQL layer.
Adjust tests accordingly to hide #sql-ib tables, which can
continue to exist until the background DROP TABLE completes.
This is joint work with Thirunarayanan Balathandayuthapani.
The MDL interface between InnoDB and the rest of the server
(in storage/innobase/dict/dict0dict.cc and in include/)
is my work, while most everything else is Thiru's.
The collection of InnoDB persistent statistics and the
defragmentation were not refactored to use MDL. They will
keep relying on lower-level interlocking with
fil_check_pending_operations().
The purge of transaction history and the background operations on
fulltext indexes will use MDL. We will revert
commit 2c4844c9e7
(MDEV-17813) because thanks to MDL, purge cannot conflict
with DDL operations anymore. For a similar reason, we will remove
the MDEV-16222 test case from gcol.innodb_virtual_debug_purge.
Purge is essentially replacing all use of the global dict_sys.latch
with MDL. Purge will skip the undo log records for tables whose names
start with #sql-ib or #sql2. Theoretically, such tables might
be renamed back to visible table names if TRUNCATE fails to
create a new table, or the final rename in ALTER TABLE...ALGORITHM=COPY
fails. In that case, purge could permanently leave some garbage
in the table. Such garbage will be tolerated; the table would not
be considered corrupted.
To avoid repeated MDL releases and acquisitions,
trx_purge_attach_undo_recs() will sort undo log records by table_id,
and purge_node_t will keep the MDL and table handle open for multiple
successive undo log records.
get_purge_table(): A new accessor, used during the purge of
history for indexed virtual columns. This interface should ideally
not exist at all.
thd_mdl_context(): Accessor of THD::mdl_context.
Wrapped in a new thd_mdl_service.
dict_get_db_name_len(): Define inline.
dict_acquire_mdl_shared(): Acquire explicit shared MDL on a table name
if needed.
dict_table_open_on_id(): Return MDL_ticket, if requested.
dict_table_close(): Release MDL ticket, if requested.
dict_fts_index_syncing(), dict_index_t::index_fts_syncing: Remove.
row_drop_table_for_mysql() no longer needs to check these, because
MDL guarantees that a fulltext index sync will not be in progress
while MDL_EXCLUSIVE is protecting a DDL operation.
dict_table_t::parse_name(): Parse the table name for acquiring MDL.
purge_node_t::undo_recs: Change the type to std::list<trx_purge_rec_t*>
(different container, and storing also roll_ptr).
purge_node_t: Add mdl_ticket, last_table_id, purge_thd, mdl_hold_recs
for acquiring MDL and for keeping the table open across multiple
undo log records.
purge_vcol_info_t, row_purge_store_vsec_cur(), row_purge_restore_vsec_cur():
Remove. We will acquire the MDL earlier.
purge_sys_t::heap: Added, for reading undo log records.
fts_sync_during_ddl(): Invoked during ALGORITHM=INPLACE operations
to ensure that fts_sync_table() will not conflict with MDL_EXCLUSIVE.
Uses fts_t::sync_message for bookkeeping.
Add support of referential constraints directly in column defininions:
create table t1 (id1 int primary key);
create table t2 (id2 int references t1(id1));
Referenced field name can be omitted if equal to foreign field name:
create table t1 (id int primary key);
create table t2 (id int references t1);
Until 10.5 this syntax was understood by the parser but was silently
ignored.
In case of generated columns this syntax is disabled at parser level
by ER_PARSE_ERROR. Note that separate FOREIGN KEY clause for generated
columns is disabled at storage engine level.
Almost all threads have gone
- the "ticking" threads, that sleep a while then do some work)
(srv_monitor_thread, srv_error_monitor_thread, srv_master_thread)
were replaced with timers. Some timers are periodic,
e.g the "master" timer.
- The btr_defragment_thread is also replaced by a timer , which
reschedules it self when current defragment "item" needs throttling
- the buf_resize_thread and buf_dump_threads are substitutes with tasks
Ditto with page cleaner workers.
- purge workers threads are not tasks as well, and purge cleaner
coordinator is a combination of a task and timer.
- All AIO is outsourced to tpool, Innodb just calls thread_pool::submit_io()
and provides the callback.
- The srv_slot_t was removed, and innodb_debug_sync used in purge
is currently not working, and needs reimplementation.
Conversion to a temporal data type resulting into a lower precision
depends on TIME_ROUND_FRACTIONAL. Taking into account this dependency in:
- indexed generated virtual column expressions
- persistent virtual column expressions
A warning is now issued if conversion from the generation expression
to the column data type depends on TIME_ROUND_FRACTIONAL.
The warning will be changed to error in 10.5
This change takes into account a column's GENERATED ALWAYS AS
expression dependcy on sql_mode's PAD_CHAR_TO_FULL_LENGTH and
NO_UNSIGNED_SUBTRACTION flags.
Indexed virtual columns as well as persistent generated columns are
now not allowed to have such dependencies to avoid inconsistent data
or index files on sql_mode changes.
So an error is now returned in cases like this:
CREATE OR REPLACE TABLE t1
(
a CHAR(5),
v VARCHAR(5) AS (a) PERSISTENT -- CHAR->VARCHAR or CHAR->TEXT = ERROR
);
Functions RPAD() and RTRIM() can now remove dependency on
PAD_CHAR_TO_FULL_LENGTH. So this can be used instead:
CREATE OR REPLACE TABLE t1
(
a CHAR(5),
v VARCHAR(5) AS (RTRIM(a)) PERSISTENT
);
Note, unlike CHAR->VARCHAR and CHAR->TEXT this still works,
not RPAD(a) is needed:
CREATE OR REPLACE TABLE t1
(
a CHAR(5),
v CHAR(5) AS (a) PERSISTENT -- CHAR->CHAR is OK
);
More sql_mode flags may affect values of generated columns.
They will be addressed separately.
See comments in sql_mode.h for implementation details.
This allows one to run the test suite even if any of the following
options are changed:
- character-set-server
- collation-server
- join-cache-level
- log-basename
- max-allowed-packet
- optimizer-switch
- query-cache-size and query-cache-type
- skip-name-resolve
- table-definition-cache
- table-open-cache
- Some innodb options
etc
Changes:
- Don't print out the value of system variables as one can't depend on
them to being constants.
- Don't set global variables to 'default' as the default may not
be the same as the test was started with if there was an additional
option file. Instead save original value and reset it at end of test.
- Test that depends on the latin1 character set should include
default_charset.inc or set the character set to latin1
- Test that depends on the original optimizer switch, should include
default_optimizer_switch.inc
- Test that depends on the value of a specific system variable should
set it in the test (like optimizer_use_condition_selectivity)
- Split subselect3.test into subselect3.test and subselect3.inc to
make it easier to set and reset system variables.
- Added .opt files for test that required specfic options that could
be changed by external configuration files.
- Fixed result files in rockdsb & tokudb that had not been updated for
a while.
(Backported to 10.3, addressed review input)
Sj_materialization_picker::check_qep(): fix error in cost/fanout
calculations:
- for each join prefix, add #prefix_rows / TIME_FOR_COMPARE to the cost,
like best_extension_by_limited_search does
- Remove the fanout produced by the subquery tables.
- Also take into account join condition selectivity
optimize_wo_join_buffering() (used by LooseScan and FirstMatch)
- also add #prefix_rows / TIME_FOR_COMPARE to the cost of each prefix.
- Also take into account join condition selectivity
Reason for the change was that ha_notify_table_changed() was done
after table open when .frm had been replaced, which caused failure
in engines that checks on open if .frm matches the engines table
definition.
Other changes:
- Remove not needed open/close call at end of inline alter table.
Some test that depended on the table beeing in the table cache after
ALTER TABLE had to be updated.
row_purge_upd_exist_or_extern_func(): Check for node->vcol_op_failed()
after row_purge_remove_sec_if_poss(), like row_purge_del_mark() did.
This avoids us dereferencing the node->table=NULL pointer.
The test case, submitted by Elena Stepanova, is not deterministic and
does not repeat the bug on 10.2. With the added loop, for me, it reliably
crashes 10.3 without the fix. I was unable to create a deterministic
test case for either 10.2 or 10.3.
Reviewed by Thirunarayanan Balathandayuthapani
Condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on the fields that are used in the GROUP BY list
or depends on the fields that are equal to grouping fields.
Aggregate functions can't be pushed down.
How the pushdown is performed on the example:
SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);
=>
SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);
The implementation scheme:
1. Extract the most restrictive condition cond from the HAVING clause of
the select that depends only on the fields that are used in the GROUP BY
list of the select (directly or indirectly through equalities)
2. Save cond as a condition that can be pushed into the WHERE clause
of the select
3. Remove cond from the HAVING clause if it is possible
The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().
New test file having_cond_pushdown.test is created.