- Added sql/mariadb.h file that should be included first by files in sql
directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h
"Optimization for equi-joins of derived tables with GROUP BY"
should be considered rather as a 'proof of concept'.
The task itself is targeted at an optimization that employs re-writing
equi-joins with grouping derived tables / views into lateral
derived tables. Here's an example of such transformation:
select t1.a,t.max,t.min
from t1 [left] join
(select a, max(t2.b) max, min(t2.b) min from t2
group by t2.a) as t
on t1.a=t.a;
=>
select t1.a,tl.max,tl.min
from t1 [left] join
lateral (select a, max(t2.b) max, min(t2.b) min from t2
where t1.a=t2.a) as t
on 1=1;
The transformation pushes the equi-join condition t1.a=t.a into the
derived table making it dependent on table t1. It means that for
every row from t1 a new derived table must be filled out. However
the size of any of these derived tables is just a fraction of the
original derived table t. One could say that transformation 'splits'
the rows used for the GROUP BY operation into separate groups
performing aggregation for a group only in the case when there is
a match for the current row of t1.
Apparently the transformation may produce a query with a better
performance only in the case when
- the GROUP BY list refers only to fields returned by the derived table
- there is an index I on one of the tables T used in FROM list of
the specification of the derived table whose prefix covers the
the fields from the proper beginning of the GROUP BY list or
fields that are equal to those fields.
Whether the result of the re-writing can be executed faster depends
on many factors:
- the size of the original derived table
- the size of the table T
- whether the index I is clustering for table T
- whether the index I fully covers the GROUP BY list.
This patch only tries to improve the chosen execution plan using
this transformation. It tries to do it only when the chosen
plan reaches the derived table by a key whose prefix covers
all the fields of the derived table produced by the fields of
the table T from the GROUP BY list.
The code of the patch does not evaluates the cost of the improved
plan. If certain conditions are met the transformation is applied.
Do not run the window function computation step when the select
produces no rows (zero_result_cause!=NULL).
This may cause reads from uninitialized memory.
We still need to run the window function computation step when
the output includes just one row (for example
SELECT MAX(col), RANK() OVER (...) FROM t1 WHERE 1=0).
This fix also resolves an issue with queries with window functions
producing an output row where should be none, like in
SELECT ROW_NUMBER() FROM t1 WHERE 1=0.
Updated a few test results in the existing tests to reflect this.
There was a missing test in CTE handling if creating a temporary table
failed (in this case as a result of out of space). This caused a table
handler to be used even if it was not allocated.
- Added variable tmp_disk_table_size
- Added variable tmp_memory_table_size as an alias for tmp_table_size
- Changed internal variable tmp_table_size to tmp_memory_table_size
- create_info.data_file_length is now set with tmp_disk_table_size
- Fixed that Aria doesn't reset max_data_file_length for internal tables
- Added status flag if table is full so that we can detect this on next insert.
This ensures that the table is always 'correct', but we get the error one
row after the row that grow the table too big.
- Removed some mutex lock for internal temporary tables
This is another attempt to fix the bug mdev-12992.
This patch introduces st_select_lex::context_analysis_place for
the place in SELECT where context analysis is currently performed.
It's similar to st_select_lex::parsing_place, but it is used at
the preparation stage.
This is actually a legacy bug:
SQL_SELECT::test_quick_select() was called
with SQL_SELECT::head not set.
It looks like that this problem can be
reproduced only on queries with ORDER BY
that use IN predicates converted to semi-joins.
This patch corrects the fix for bug mdev-7599.
When the min/max optimization of the function
opt_sum_query() optimizes away all tables of
a subquery it should not ever be rolled back.
If the optimizer chose an execution plan where
a semi-join nest were materialized and the
result of materialization was scanned to access
other tables by ref access it could build a key
over columns of the tables from the nest that
were actually inaccessible.
The patch performs a proper check whether a key
that uses columns of the tables from a materialized
semi-join nest can be employed to access outer tables.
This excludes MDEV-12472 (InnoDB should accept XtraDB parameters,
warning that they are ignored). In other words, MariaDB 10.3 will not
recognize any XtraDB-specific parameters.
This corrects the patch for mdev-10006.
The current code supports only those semi-join nests that are placed at
the join top level. So such nests cannot depend on other tables or nests.
This is a joint patch fixing the following problems:
MDEV-12875 Wrong VIEW column data type for COALESCE(int_column)
MDEV-12886 Different default for INT and BIGINT column in a VIEW for a SELECT with ROLLUP
MDEV-12916 Wrong column data type for an INT field of a cursor-anchored ROW variable
All above problem happened because the global function ::create_tmp_field()
called the top-level Item::create_tmp_field(), which made some tranformation
for INT-result data types. For example, INT(11) became BIGINT(11), because 11
is a corner case and it's not known if it fits or does not fit into INT range,
so Item::create_tmp_field() converted it to BIGINT(11) for safety.
The main idea of this patch is to avoid such tranformations.
1. Fixing Item::create_tmp_field() not to have a special case for INT_RESULT.
Item::create_tmp_field() is changed not to have a special case
for INT_RESULT (which earlier made a decision based on Item's max_length).
It now calls tmp_table_field_from_field_type() for INT_RESULT,
therefore preserves the original data type (e.g. INT, YEAR) without
conversion to BIGINT.
This change is valid, because a number of recent fixes
(e.g. in Item_func_int, Item_hybrid_func, Item_int, Item_splocal)
guarantee that item->type_handler() now properly returns
type_handler_long vs type_handler_longlong. So no adjustment by length
is needed any more for Items returning INT_RESULT.
After this change, Item::create_tmp_field() calls
tmp_table_field_from_field_type() for all XXX_RESULT, except REAL_RESULT.
2. Fixing Item::create_tmp_field() not to have a special case for REAL_RESULT.
Note, the reason for a special case for REAL_RESULT is to have a special
constructor for Field_double(), forcing Field_real::not_fixed to be set
to true.
Taking into account that only Item_sum descendants actually need a special
constructor call Field_double(not_fixed=true), not too loose precision
when mixing individual rows to the aggregate result:
- renaming Item::create_tmp_field() to Item_sum::create_tmp_field().
- changing Item::create_tmp_field() just to call
tmp_table_field_from_field_type() for all XXX_RESULT types.
A special case for REAL_RESULT in Item::create_tmp_field() is now gone.
Item::create_tmp_field() is now symmetric for all XXX_RESULT types,
and now just calls tmp_table_field_from_field_type().
3. Fixing Item_func::create_field_for_create_select() not to have
a special case for STRING_RESULT.
After changes #1 and #2, the code in
Item_func::create_field_for_create_select(), testing result_type(),
becomes useless, because: now Item::create_tmp_field() and
tmp_table_field_from_field_type() do exactly the same thing for all
XXX_RESULT types for Item_func descendants:
a. It calls tmp_table_field_from_field_type for STRING_RESULT directly.
b. For other XXX_RESULT, it goes through Item::create_tmp_field(),
which calls the global function ::create_tmp_field(),
which calls item->create_tmp_field() for FUNC_ITEM,
which calls tmp_table_field_from_field_type() again.
So removing the virtual implementation of
Item_func::create_field_for_create_select().
The inherited Item::create_field_for_create_select() now perfectly
does the job, as it also calls tmp_table_field_from_field_type()
for FUNC_ITEM, independently from XXX_RESULT type.
4. Taking into account #1 and #2, as well as some recent changes,
removing virtual implementations:
- Item_hybrid_func::create_tmp_field()
- Item_hybrid_func::create_field_for_create_select()
- Item_int_func::create_tmp_field()
- Item_int_func::create_field_for_create_select()
- Item_temporal_func::create_field_for_create_select()
The derived versions from Item now perfectly work.
5. Moving a piece of code from create_tmp_field_from_item()
to a new function create_tmp_field_from_item_finalize(),
to reuse it in two places (see #6).
6. Changing the code responsible for BIT->INT/BIGIN tranformation
(which is called for the cases when the created table, e.g. HEAP,
does not fully support BIT) not to call create_tmp_field_from_item(),
because the latter now calls tmp_table_field_from_field_type() instead
of create_tmp_field() and thefore cannot do BIT transformation.
So rewriting this code using a sequence of these calls:
- item->type_handler_long_or_longlong()
- handler->make_and_init_table_field()
- create_tmp_field_from_item_finalize()
7. Miscelaneous changes:
- Moving type_handler_long_or_longlong() from "protected" to "public",
as it's now needed in the global function create_tmp_field().
8. The above changes fixed MDEV-12875, MDEV-12886, MDEV-12916.
So adding tests for these bugs.
MDEV-12858 Out-of-range error for CREATE..SELECT unsigned_int_column+1
MDEV-12859 Out-of-range error for CREATE..SELECT @a:=EXTRACT(MINUTE_MICROSECOND FROM..)
MDEV-12862 Data type of @a:=1e0 depends on the session character set
1. Moving a part of Item::create_tmp_field() into a new helper method
Item::create_tmp_field_int() and reusing it in Item::create_tmp_field()
and Item_func_signed::create_tmp_field().
Fixing the code in Item::create_tmp_field_int() to call
Type_handler::make_table_field() instead of doing "new Field_long[long]"
directly. This change revealed a problem reported in MDEV-12862.
2. Changing the "long vs longlong" cut-off length for
- Item_func::create_tmp_field()
- Item_sum::create_tmp_field()
- Item_func_get_user_var::create_tmp_field()
from MY_INT32_NUM_DECIMAL_DIGITS to (MY_INT32_NUM_DECIMAL_DIGITS - 2).
This fixes MDEV-12858.
After this change, the "convert_int_length" parameter to
Item::create_tmp_field() is not needed any more, because
(MY_INT32_NUM_DECIMAL_DIGITS - 2) is always passed.
So removing the "convert_int_length" parameter.
3. Fixing Item::create_tmp_field() to pass max_char_length() instead
of max_length to the constructor of Field_double().
This fixes MDEV-12862.
4. Additionally, fixing
- Type_handler_{tiny|short|int24|long|longlong}::make_table_field()
- Type_handler_{float|double}::make_table_field()
to pass max_char_length() instead of max_length to Field contructors.
This is needed by the change (1).
5. Adding new tests, and recording new correct results in the old tests in:
- mysql-test/r/type_ranges.result
- storage/tokudb/mysql-test/tokudb/r/type_ranges.result
Significantly reduce the amount of InnoDB, XtraDB and Mariabackup
code changes by defining pfs_os_file_t as something that is
transparently compatible with os_file_t.
In some rare cases queries with UNION ALL
using a derived table specified by
a grouping select with a subquery in WHERE and
impossible HAVING detected after constant row
substitution could hang.
The cause was not a proper return from the
function subselect_single_select_engine::exec()
in the case when the subquery was not optimized
beforehand and the optimization performed
in this function requested for a change of the
subquery engine. This was fixed.
Also a change was applied that avoided execution
of a subquery if impossible having was detected
for the main query at the optimization stage.
When an IN subquery predicate was converted to a semi-join that were
materialized and the result of the materialization happened to be
the last in the execution plan then any conjunctive condition with RAND()
turned out to be lost.
Fixed by attaching this condition to the last top base table.
JOIN_TAB::remove_redundant_bnl_scan_conds() removes select_cond
from a JOIN_TAB if join cache is enabled, and tab->cache_select->cond
is the equal to tab->select_cond.
But after 8d99166c69 the code to initialize join cache was moved
to happen much later than JOIN_TAB::remove_redundant_bnl_scan_conds(),
and that code might, under certain conditions, revert to *not* using
join cache (set_join_cache_denial()).
If JOIN_TAB::remove_redundant_bnl_scan_conds() removes the WHERE
condition from the JOIN_TAB and later set_join_cache_denial() disables
join cache, we end up with no WHERE condition at all.
Fix: move JOIN_TAB::remove_redundant_bnl_scan_conds() to happen
after all possible set_join_cache_denial() calls.
This patch corrects the fix for the bug mdev-10693.
It is critical for the function get_best_combination() not to call
create_ref_for_key() for constant tables.
This bug could manifest itself only in multi-table subqueries where
one of the tables is accessed by a constant primary key.
The usage of windows functions when all tables were optimized away
by min/max optimization were not supported. As result a result,
the queries that used window functions with min/max aggregation
over the whole table returned wrong result sets.
The patch fixed this problem.