This allows a user to to change the default value of MAX_SEL_ARGS (16000)
in the rare case where they neeed more generated SEL_ARGS (as part of
the range optimizer)
Raise notes if indexes cannot be used:
- in case of data type or collation mismatch (diferent error messages).
- in case if a table field was replaced to something else
(e.g. Item_func_conv_charset) during a condition rewrite.
Added option to write warnings and notes to the slow query log for
slow queries.
New variables added/changed:
- note_verbosity, with is a set of the following options:
basic - All old notes
unusable_keys - Print warnings about keys that cannot be used
for select, delete or update.
explain - Print unusable_keys warnings for EXPLAIN querys.
The default is 'basic,explain'. This means that for old installations
the only notable new behavior is that one will get notes about
unusable keys when one does an EXPLAIN for a query. One can turn all
of all notes by either setting note_verbosity to "" or setting sql_notes=0.
- log_slow_verbosity has a new option 'warnings'. If this is set
then warnings and notes generated are printed in the slow query log
(up to log_slow_max_warnings times per statement).
- log_slow_max_warnings - Max number of warnings written to
slow query log.
Other things:
- One can now use =ALL for any 'set' variable to set all options at once.
For example using "note_verbosity=ALL" in a config file or
"SET @@note_verbosity=ALL' in SQL.
- mysqldump will in the future use @@note_verbosity=""' instead of
@sql_notes=0 to disable notes.
- Added "enum class Data_type_compatibility" and changing the return type
of all Field::can_optimize*() methods from "bool" to this new data type.
Reviewer & Co-author: Alexander Barkov <bar@mariadb.com>
- The code that prints out the notes comes mainly from Alexander
don't construct open ranges from prefix blob keys for < (less than)
just as it's already done for > (greater than)
because prefix KEY_PART doesn't create prefix Field for blobs
(see open_table_from_share() near "Create a new field for the key part"),
so stored_field_cmp_to_item() will compare the original field to the
value not taking the prefix length into account.
LooseScan code set opt_range_condition_rows to be the
MIN(loose_scan_plan->records, table->records)
totally ignoring possible quick range selects. If there was a quick
select $QUICK on another index with
$QUICK->records < loose_scan_plan->records
this would create a situation where
opt_range_condition_rows > $QUICK->records
which causes an assert in 10.6+ and potentially wrong query plan
choice in 10.5.
Fixed by making opt_range_condition_rows to be the minimum #rows
of any quick select.
Approved-by: Monty <monty@mariadb.org>
A GROUP BY query which uses "MIN(pk)" and has "pk<>const" in the
WHERE clause would produce wrong result when handled with "Using index
for group-by". Here "pk" column is the table's primary key.
The problem was introduced by fix for MDEV-23634. It made the range
optimizer to not produce ranges for conditions in form "pk != const".
However, LooseScan code requires that the optimizer is able to
convert the condition on the MIN/MAX column into an equivalent range.
The range is used to locate the row that has the MIN/MAX value.
LooseScan checks this in check_group_min_max_predicates(). This fix
makes the code in that function to take into account that "pk != const"
does not produce a range.
This bug could affect multi-update statements as well as single-table
update statements processed as multi-updates when the where condition
contained a range condition over a non-indexed varchar column. The
optimizer calculates selectivity of such range conditions using histograms.
For each range the buckets containing endpoints of the the range are
determined with a procedure that stores the values of the endpoints in the
space of the record buffer where values of the columns are usually stored.
For a range over a varchar column the value of a endpoint may exceed the
size of the buffer and in such case the value is stored with truncation.
This truncations cannot affect the result of the calculation of the range
selectivity as the calculation employes only the beginning of the value
string. However it can trigger generation of an unexpected error on this
truncation if an update statement is processed.
This patch prohibits truncation messages when selectivity of a range
condition is calculated for a non-indexed column.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .
Code style changes have been done on top. The result of this change
leads to the following improvements:
1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.
2. Compiler can better understand the intent of the code, thus it leads
to more optimization possibilities. Additionally it enabled detecting
unused variables that had an empty default constructor but not marked
so explicitly.
Particular change required following this patch in sql/opt_range.cc
result_keys, an unused template class Bitmap now correctly issues
unused variable warnings.
Setting Bitmap template class constructor to default allows the compiler
to identify that there are no side-effects when instantiating the class.
Previously the compiler could not issue the warning as it assumed Bitmap
class (being a template) would not be performing a NO-OP for its default
constructor. This prevented the "unused variable warning".
This issue was caused by the bug fix for
MDEV-30325 Wrong result upon range query using index condition
The bug could happen in the case of several overlapping key ranges
with OR
This was caused by a bug in key_or() when SEL_ARG* key1 has been cloned
and is overlapping with SEL_ARG *key2
Cloning of SEL_ARG's happens only in very special cases, which is why this
bug has remained undetected for years.
It happend in the following query:
SELECT COUNT(*) FROM lineitem force index
(i_l_orderkey_quantity,i_l_shipdate) WHERE
l_shipdate < '1994-01-01' AND l_orderkey < 800 OR
l_quantity > 3 AND l_orderkey NOT IN ( 157, 1444 );
Because there are two different indexes that can be used and the code for
IN causes a 'tree_or', which causes all SEL_ARG's to be cloned.
Other things:
- While checking the code, I found a bug in SEL_ARG::SEL_ARG(SEL_ARG &arg)
- This was incrementing next_key_part->use_count as part of creating a
copy of an existing SEL_ARG.
This is however not enough as the 'reverse operation' when the copy is
not needed is 'key2_cpy.increment_use_count(-1)', which does something
completely different.
Fixed by calling increment_use_count(1) in SEL_ARG::SEL_ARG.
Make SEL_ARG::make_root() maintain SEL_ARG::weight.
Also, an unrelated change: fix dbug_print_sel_arg() to correctly
print SQL NULL for the right endpoint.
(addressed review input)
The issue was introduced by @@optimizer_max_sel_arg_weight code.
key_or() calls SEL_ARG::update_weight_locally(). That function
takes O(tree->elements) time.
Without that call, key_or(big_tree, one_element_tree) would take
O(log(big_tree)) when one_element_tree doesn't overlap with elements
of big_tree.
This means, update_weight_locally() can cause a big slowdown.
The fix:
1. key_or() actually doesn't need to call update_weight_locally().
It calls SEL_ARG::tree_delete() and SEL_ARG::insert(). These functions
update SEL_ARG::weight.
It also manipulates the SEL_ARG objects directly, but these
modifications do not change the weight of the tree.
I've just removed the update_weight_locally() call.
2. and_all_keys() also calls update_weight_locally(). It manipulates the
SEL_ARG graph directly.
Removed that call and added the code to update the SEL_ARG graph weight.
Tests main.range and main.range_not_embedded already contain the queries
that have test coverage for the affected code.
The problem was that get_best_group_min_max() did not check if fields used
by the "group_min_max optimization" where used in sub queries.
Because of this, it did not detect that a key (b,a) was used in the WHERE
clause for the statement:
SELECT DISTINCT b FROM t1 WHERE EXISTS ( SELECT 1 FROM DUAL WHERE a > 1 ).
Fixed by also traversing the sub queries when checking if a field is used.
This disables group_min_max_optimization for the above query.
Reviewer: Sergei Petrunia
The issue was that calc_cond_selectivity_for_table prefered ranges with
many parts and when deciding on which selectivity to use.
Fixed by going through ranges according to the number of rows in the range.
This ensures that selectivity from ranges with few rows will be prefered
over ranges with many rows for indexes that uses the same columns.
Federated and Federatex cannot be used with ROR scans
Federated::position() and Federatex::position() is storing in 'ref' a
pointer into a local result set buffer. This means that one cannot
compare 'ref' from different handler instances to see if they point to the
same physical record.
This bug caused federated.federatedx to return wrong results when the
optimizer tried to use index_merge to resolve some queries.
Fixed by introducing table flag HA_NON_COMPARABLE_ROWID and using this
with the above handlers.
Todo:
- Fix multi_delete(), multi_update and read_records() to use primary key
instead of 'ref' if case HA_NON_COMPARABLE_ROWID is set. The current
code only works if we have only one range (like table scan) for the
tables that will be updated in the second pass.
- Enable DBUG_ASSERT() in ha_federated::cmp_ref() and
ha_federatedx::cmp_ref().
If when extracting a range condition for an index from the WHERE condition
Range Optimizer sees that the range condition covers the whole index then
such condition should be discarded because it cannot be used in any range
scan. In some cases Range Optimizer really does it, but there remained some
conditions for which it was not done. As a result the optimizer could
produce index merge plans with the full index scan for one of the indexes
participating in the index merge.
This could be observed in one of the test cases from index_merge1.inc
where a plan with index_merge_sort_union was produced and in the test case
reported for this bug where a plan with index_merge_sort_intersect was
produced. In both cases one of two index scans participating in index merge
ran over the whole index.
The patch slightly changes the original above mentioned test case from
index_merge1.inc to be able to produce an intended plan employing
index_merge_sort_union. The original query was left to show that index
merge is not used for it anymore.
It should be noted that for the plan with index_merge_sort_intersect could
be chosen for execution only due to a defect in the InnoDB code that
returns wrong estimates for the cardinality of big ranges.
This bug led to serious problems in 10.4+ where the optimization using
Rowid filters is employed (see mdev-26446).
Approved by Sergey Petrunia <sergey@mariadb.com>
Changes:
- To detect automatic strlen() I removed the methods in String that
uses 'const char *' without a length:
- String::append(const char*)
- Binary_string(const char *str)
- String(const char *str, CHARSET_INFO *cs)
- append_for_single_quote(const char *)
All usage of append(const char*) is changed to either use
String::append(char), String::append(const char*, size_t length) or
String::append(LEX_CSTRING)
- Added STRING_WITH_LEN() around constant string arguments to
String::append()
- Added overflow argument to escape_string_for_mysql() and
escape_quotes_for_mysql() instead of returning (size_t) -1 on overflow.
This was needed as most usage of the above functions never tested the
result for -1 and would have given wrong results or crashes in case
of overflows.
- Added Item_func_or_sum::func_name_cstring(), which returns LEX_CSTRING.
Changed all Item_func::func_name()'s to func_name_cstring()'s.
The old Item_func_or_sum::func_name() is now an inline function that
returns func_name_cstring().str.
- Changed Item::mode_name() and Item::func_name_ext() to return
LEX_CSTRING.
- Changed for some functions the name argument from const char * to
to const LEX_CSTRING &:
- Item::Item_func_fix_attributes()
- Item::check_type_...()
- Type_std_attributes::agg_item_collations()
- Type_std_attributes::agg_item_set_converter()
- Type_std_attributes::agg_arg_charsets...()
- Type_handler_hybrid_field_type::aggregate_for_result()
- Type_handler_geometry::check_type_geom_or_binary()
- Type_handler::Item_func_or_sum_illegal_param()
- Predicant_to_list_comparator::add_value_skip_null()
- Predicant_to_list_comparator::add_value()
- cmp_item_row::prepare_comparators()
- cmp_item_row::aggregate_row_elements_for_comparison()
- Cursor_ref::print_func()
- Removes String_space() as it was only used in one cases and that
could be simplified to not use String_space(), thanks to the fixed
my_vsnprintf().
- Added some const LEX_CSTRING's for common strings:
- NULL_clex_str, DATA_clex_str, INDEX_clex_str.
- Changed primary_key_name to a LEX_CSTRING
- Renamed String::set_quick() to String::set_buffer_if_not_allocated() to
clarify what the function really does.
- Rename of protocol function:
bool store(const char *from, CHARSET_INFO *cs) to
bool store_string_or_null(const char *from, CHARSET_INFO *cs).
This was done to both clarify the difference between this 'store' function
and also to make it easier to find unoptimal usage of store() calls.
- Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
- Changed some 'const char*' arrays to instead be of type LEX_CSTRING.
- class Item_func_units now used LEX_CSTRING for name.
Other things:
- Fixed a bug in mysql.cc:construct_prompt() where a wrong escape character
in the prompt would cause some part of the prompt to be duplicated.
- Fixed a lot of instances where the length of the argument to
append is known or easily obtain but was not used.
- Removed some not needed 'virtual' definition for functions that was
inherited from the parent. I added override to these.
- Fixed Ordered_key::print() to preallocate needed buffer. Old code could
case memory overruns.
- Simplified some loops when adding char * to a String with delimiters.
Handle "col<>const" in the same way that MDEV-21958 did for
"col NOT IN(const-list)": do not use the condition for range/index_merge
accesses if there is a unique UNIQUE KEY(col).
The testcase is in main/range.test. The rest of test updates are
due to widespread use of 'pk<>1' in the testsuite. Changed the test
to use different but equivalent forms of the conditions.