MDEV-30668 Set function aggregated in outer select used in view definition
This patch fixes two bugs concerning views whose specifications contain
subqueries with set functions aggregated in outer selects.
Due to the first bug, such views with implicit grouping were considered
mergeable. This led to wrong result sets for selects from these views.
Due to the second bug the aggregation select was determined incorrectly and
this led to bogus error messages.
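For illustration, a view of roughly this shape is affected (a sketch with
made-up table and view names, not one of the actual test cases):

  CREATE TABLE t1 (a INT);
  CREATE TABLE t2 (b INT);

  -- AVG(t1.a) inside the subquery is aggregated in the outer select,
  -- so the view has implicit grouping and must not be merged into the
  -- queries that use it.
  CREATE VIEW v1 AS
    SELECT (SELECT MAX(t2.b) FROM t2 WHERE t2.b < AVG(t1.a)) AS m
    FROM t1;

  SELECT * FROM v1;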
The patch adds several test cases for these two bugs and for four other
duplicate bugs.
The patch also enables view-protocol for many other test cases.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
st_select_lex::init_query is called during the execution of EXECUTE
IMMEDIATE 'alter table ...', so reset the initialization at the
same point where we set join= 0.
The problem was caused by use of COLLATION(AVG('x')). This is an
item whose value is a constant.
Name Resolution code called convert_const_to_int() which removed AVG('x').
However, the item representing COLLATION(...) still had with_sum_func=1.
This inconsistent state confused the code that handles grouping and
DISTINCT: JOIN::get_best_combination() decided to use one temporary
table and allocated one JOIN_TAB for it, but then
JOIN::make_aggr_tables_info() attempted to use two and made writes
beyond the end of the JOIN::join_tab array.
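A hypothetical query of roughly this shape shows the pattern (a sketch,
not the original test case):

  CREATE TABLE t1 (a INT);
  INSERT INTO t1 VALUES (1),(2);

  -- COLLATION(AVG('x')) evaluates to a constant string, so comparing it
  -- to the INT column a invites convert_const_to_int(), while the item
  -- still carries with_sum_func=1 because of the removed AVG('x').
  SELECT DISTINCT a = COLLATION(AVG('x')) FROM t1;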
The fix:
- Do not replace constant expressions which contain aggregate functions.
- Add JOIN::dbug_join_tab_array_size to catch attempts to use more
JOIN_TAB objects than we've allocated.
Handle "col<>const" in the same way that MDEV-21958 did for
"col NOT IN(const-list)": do not use the condition for range/index_merge
accesses if there is a UNIQUE KEY(col).
The testcase is in main/range.test. The rest of the test updates are
due to the widespread use of 'pk<>1' in the testsuite. Those tests were changed
to use different but equivalent forms of the conditions.
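An illustrative sketch of the access-method change (the chosen plan also
depends on table statistics):

  CREATE TABLE t1 (
    pk INT PRIMARY KEY,
    a  INT
  );

  -- pk is unique, so "pk <> 1" excludes at most one row; using it to
  -- drive a range or index_merge access would not pay off, and after
  -- this change the condition is no longer offered to those accesses.
  EXPLAIN SELECT * FROM t1 WHERE pk <> 1;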
Bit operators (~ ^ | & << >>) and the function BIT_COUNT()
always called val_int() for their arguments.
It worked correctly only for INT type arguments.
In case of DECIMAL and DOUBLE arguments it did not work well:
the argument values were truncated to the maximum SIGNED BIGINT value
of 9223372036854775807.
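For illustration, a DECIMAL argument above the signed BIGINT range shows
the difference (a sketch of the expected behaviour, not copied from the
test suite):

  -- 18446744073709551615.0 is a DECIMAL literal holding the maximum
  -- unsigned 64-bit value. Before the fix it was clipped to
  -- 9223372036854775807 before the bit operation; with val_decimal()
  -- the full unsigned value is preserved.
  SELECT 18446744073709551615.0 | 0;
  SELECT BIT_COUNT(18446744073709551615.0);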
Fixing the code as follows:
- If the argument is of an integer data type,
  it works using val_int() as before.
- If the argument is of some other data type, it gets the argument value
using val_decimal(), to avoid truncation, and then converts the result
to ulonglong.
Using Item_handled_func makes switching between the two approaches easier.
As an additional advantage, with Item_handled_func it will be easier
to implement overloading in the future, so data type plugins will be able
to define their own behaviour of bit operators and BIT_COUNT().
Moving the code from the former val_int() implementations
into methods of Longlong_null, to avoid code duplication in the
INT and DECIMAL branches.
In main.index_merge_myisam we remove the test that was added in
commit a2d24def8c because
it duplicates the test case that was added in
commit 5af12e4635.
- multi_range_read_info_const now uses the new records_in_range interface
- Added handler::avg_io_cost()
- Don't calculate avg_io_cost() in get_sweep_read_cost if avg_io_cost is
not 1.0. In this case we trust the avg_io_cost() from the handler.
- Changed test_quick_select to use TIME_FOR_COMPARE instead of
TIME_FOR_COMPARE_IDX to align this with the rest of the code.
- Fixed a bug in test_if_cheaper_ordering() where we didn't use
  keyread if the index was changed
- Fixed a bug where we didn't use index-only read when using order-by-index
- Added keyread_time() to HEAP.
The default keyread_time() was optimized for blocks and not suitable for
HEAP. The effect was that HEAP preferred table scans over ranges for btree
indexes (see the sketch after this list).
- Fixed get_sweep_read_cost() for HEAP tables
- Ensure that range and ref have same cost for simple ranges
Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
we favor ref over range for simple queries.
- Fixed that matching_candidates_in_table() uses the same number of records
as the rest of the optimizer
- Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
HEAP and temporary tables better. A few tests changed because of this.
- heap::read_time() and heap::keyread_time() adjusted to not add +1.
This was to ensure that handler::keyread_time() doesn't give
higher cost for heap tables than for normal tables. One effect of
this is that heap and derived tables stored in heap will prefer
key access as this is now regarded as cheap.
- Changed cost for index read in sql_select.cc to match
multi_range_read_info_const(). All index cost calculation is now
done through one function.
- 'ref' will now use quick_cost for keys if it exists. This is done
so that for '=' ranges, 'ref' is preferred over 'range'.
- scan_time() now takes avg_io_cost() into account
- get_delayed_table_estimates() uses block_size and avg_io_cost()
- Removed default argument to test_if_order_by_key(); simplifies code
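One way to observe the HEAP-related changes above is with a MEMORY table
carrying a BTREE index (an illustrative sketch; the chosen plan depends on
the number of rows and the selectivity of the range):

  CREATE TABLE tmem (
    a INT,
    b INT,
    KEY a_btree (a) USING BTREE
  ) ENGINE=MEMORY;

  -- After the keyread_time()/read_time() adjustments for HEAP, a
  -- selective range over a_btree is costed low enough that the
  -- optimizer can pick it instead of a full table scan.
  EXPLAIN SELECT b FROM tmem WHERE a BETWEEN 10 AND 20;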
This patch contains a full implementation of the optimization
that allows using in-memory rowid / primary key filters built for range
conditions over indexes. In many cases usage of such filters reduces
the number of disk seeks spent on fetching table rows.
In this implementation the choice of which filter to apply (if any)
is made purely on cost-based considerations.
This implementation re-architects the partial implementation of
the feature pushed by Galina Shalygina in the commit
8d5a11122c.
In addition, this patch contains a better implementation of the generic
handler function handler::multi_range_read_info_const() that
takes into account gaps between ranges when calculating the cost of
range index scans. It also contains some corrections of the
implementation of the handler function records_in_range() for MyISAM.
This patch supports the feature for InnoDB and MyISAM.
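A sketch of a query shape where such a filter can help (illustrative table
and index names; whether a filter is actually built is decided by the cost
model):

  CREATE TABLE orders (
    id       INT PRIMARY KEY,
    cust_id  INT,
    order_dt DATE,
    KEY k_cust (cust_id),
    KEY k_dt (order_dt)
  ) ENGINE=InnoDB;

  -- A rowid filter built from the range over k_dt can be checked before
  -- each row found through k_cust is fetched from the table, reducing
  -- the number of disk seeks when both conditions are selective.
  EXPLAIN
  SELECT *
  FROM orders
  WHERE cust_id = 42
    AND order_dt BETWEEN '2023-01-01' AND '2023-01-31';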