The logic and the implementation scheme are similar to those of
MDEV-9197 (Pushdown conditions into non-mergeable views/derived tables).
How the pushdown works, shown on an example:
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y) from t2 group by x);
-->
select * from t1
where a>3 and b>10 and
(a,b) in (select x,max(y)
from t2
where x>3
group by x
having max(y)>10);
The implementation scheme:
1. Search for the condition cond that depends only on the fields
from the left part of the IN subquery (left_part)
2. Find fields F_group in the select of the right part of the
IN subquery (right_part) that are used in the GROUP BY
3. Extract from the cond condition cond_where that depends only on the
fields from the left_part that stay at the same places in the left_part
(have the same indexes) as the F_group fields in the projection of the
right_part
4. Transform cond_where so it can be pushed into the WHERE clause of the
right_part and delete cond_where from the cond
5. Transform cond so it can be pushed into the HAVING clause of the right_part
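The steps above can be sketched with a toy model. This is only an illustration, not the server code: each conjunct is modelled as a (field, op, value) tuple over a single field, and the names split_pushdown, left_part, right_select and group_by are hypothetical.

```python
def split_pushdown(conds, left_part, right_select, group_by):
    """Split conjuncts over left_part fields into WHERE and HAVING parts.

    conds        -- list of (left_field, op, value) conjuncts (toy model)
    left_part    -- fields on the left of IN, e.g. ["a", "b"]
    right_select -- select-list expressions of the subquery, same length
    group_by     -- subquery expressions that appear in GROUP BY
    """
    where, having = [], []
    for field, op, value in conds:
        if field not in left_part:
            continue  # step 1: only conditions over left_part qualify
        # steps 2-3: map the field to the expression at the same index
        target = right_select[left_part.index(field)]
        if target in group_by:
            where.append((target, op, value))   # step 4: goes to WHERE
        else:
            having.append((target, op, value))  # step 5: goes to HAVING
    return where, having


where, having = split_pushdown(
    conds=[("a", ">", 3), ("b", ">", 10)],
    left_part=["a", "b"],
    right_select=["x", "max(y)"],
    group_by=["x"],
)
print(where)   # [('x', '>', 3)]
print(having)  # [('max(y)', '>', 10)]
```

With the example query above this yields x>3 for the WHERE clause and max(y)>10 for the HAVING clause of the subquery.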
The optimization is performed in
Item_in_subselect::pushdown_cond_for_in_subquery() and is controlled by the
variable condition_pushdown_for_subquery.
New test file in_subq_cond_pushdown.test is created.
setup_jtbm_semi_joins() is also changed.
It is now split into 2 procedures: setup_degenerate_jtbm_semi_joins(),
which is called before optimize_cond() for cond, and setup_jtbm_semi_joins(),
which is called after optimize_cond().
The new setup_jtbm_semi_joins() is written so that its result is
the same as if it were called before optimize_cond().
The code that is common for pushdown into materialized derived and into materialized
IN subqueries is factored out into pushdown_cond_for_derived(),
Item_in_subselect::pushdown_cond_for_in_subquery() and
st_select_lex::pushdown_cond_into_where_clause().
In this case we have a view that is mergeable, but we cannot merge it into the
parent select because that would exceed the maximum number of tables allowed in the join list, so we
materialize this view.
TABLE_LIST::derived is NULL for such views; it is only set for views which have ALGORITHM=TEMPTABLE.
Fixed by making sure TABLE_LIST::derived is set for views that could not be merged.
Lots of changes:
* calculate the current history partition in ::external_lock(),
not in ::write_row() or ::update_row()
* remove dynamically collected per-partition row_end stats
* no full table scan in open_table_from_share to calculate these
stats, no manual MDL/thr_locks in open_table_from_share
* no shared stats in TABLE_SHARE = no mutexes or condition waits when
calculating current history partition
* always compare timestamps, don't convert them to MYSQL_TIME
(avoid DST ambiguity, and it's faster too)
* correct interval handling, 1 month = 1 month, not 30 * 24 * 3600 seconds
* save/restore first partition start time, and count intervals from there
* only allow dropping the first partitions if INTERVAL is used
* when adding new history partitions, split the data in the last history
partition, if it has overflowed
* show partition boundaries in INFORMATION_SCHEMA.PARTITIONS
is not supported
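The interval bullet above ("1 month = 1 month, not 30 * 24 * 3600 seconds") can be illustrated with a small sketch. It is not the server code: add_months is a hypothetical helper, and UTC datetimes stand in for the timestamps the server compares.

```python
import calendar
from datetime import datetime, timezone


def add_months(ts, months):
    """Add calendar months to a datetime, clamping the day to month end."""
    index = ts.year * 12 + (ts.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(ts.day, calendar.monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day)


# Start time of the first partition; intervals are counted from here.
start = datetime(2018, 1, 1, tzinfo=timezone.utc)

# Boundary after 2 intervals of 1 month, counted as calendar months.
by_calendar = add_months(start, 2)

# The same boundary if a month were treated as 30 * 24 * 3600 seconds.
by_seconds = datetime.fromtimestamp(
    start.timestamp() + 2 * 30 * 24 * 3600, tz=timezone.utc
)

print(by_calendar.date())  # 2018-03-01
print(by_seconds.date())   # 2018-03-02
```

The two results already differ after two intervals, and the drift grows with every further partition.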
Allowed to use recursive references in derived tables.
As a result usage of recursive references in operands of
INTERSECT / EXCEPT is now supported.
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. Change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider, sphinx and connect for now.
This will make it easier to see how memory allocation is done when debugging
with either DBUG or gdb.
This will especially help when debugging stored procedures.
The main change is a name argument as the second argument to init_alloc_root()
and init_sql_alloc().
Other things:
- Added DBUG_ENTER/EXIT to some Virtual_tmp_table functions
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db
Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because process list_db wasn't always
correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed unused version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
Now we don't open every partition when a partition selection is explicitly specified.
ha_partition::m_opened_partition bitmap added to track
partitions that were actually opened.
and the system_versioning_transaction_registry variable.
The user enables transaction registry by specifying BIGINT for
row_start/row_end columns.
check mysql.transaction_registry structure on the first open,
not on startup. Avoid warnings unless transaction_registry
is actually used.
Many related changes.
Note that AS OF condition must always be pushed down to physical tables,
it cannot be applied to a derived or a view. Thus:
* no versioning for internal temporary tables, they can never store
historical data.
* remove special versioning code from mysql_derived_prepare and
remove ER_VERS_DERIVED_PROHIBITED - derived can have no historical
data and cannot be prohibited for system versioning related reasons.
* do not expand select list for derived/views with sys vers fields,
derived/views can never have historical data.
* remove special invisibility rules for sys vers fields, they are no
longer needed after the previous change
* remove system_versioning_hide, it lost the meaning after the
previous change.
* remove ER_VERS_SYSTEM_TIME_CLASH, it's no "clash", the inner
AS OF clause always wins.
* non-versioned fields in a historical query
reword the warning text, downgrade to note, don't
replace values with NULLs
trx_undo_page_report_modify(): For SPATIAL INDEX, keep logging
updated off-page columns twice, so that
the minimum bounding rectangle (MBR) will be logged.
Avoiding the redundant logging would require larger changes
to the undo log format.
row_build_index_entry_low(): Handle SPATIAL_UNKNOWN more robustly,
by refusing to purge the record from the spatial index.
We can get this code when processing old undo log from 10.2.10 or
10.2.11 (the releases affected by MDEV-14799, which was a regression
from MDEV-14051).
If a translation table is present when we materialize the derived table, then
change it to point to the materialized table.
Added debug info to show what actually happens with each derived table.
Other changes done to get this to work:
- Added 'internal_tables' to TABLE object to list which sequence tables
are needed to use the table.
- Mark any expression using DEFAULT() with LEX->default_used.
This is needed when deciding if we should open internal sequence
tables when a table is opened (we don't need to open sequence tables
if the main table is only used with SELECT).
- Create_and_open_temporary_table() can now also open all internal
sequence tables.
- Added option MYSQL_LOCK_USE_MALLOC to mysql_lock_tables()
to force memory allocation to be used with malloc instead of
memroot.
- Added flag to MYSQL_LOCK to remember if allocation was done with
malloc or memroot (makes code simpler and safer).
- init_one_table_for_prelocking() now takes an argument for what lock to
use instead of whether it's a routine or something else.
- Renamed prelocking placeholders to make them more understandable as
they are now used in more code.
- Changed the test in check_lock_and_start_stmt() for whether the found table has
correct locks. The old test didn't work for tables that have lock
TL_WRITE_ALLOW_WRITE, which is what sequence tables are using.
- Added VCOL_NOT_VIRTUAL option to ensure that sequence functions can't
be used with virtual columns
- More sequence tests
Merge branch '10.3' into trunk
Both field_visibility and VERS_HIDDEN_FLAG exist independently.
TODO:
VERS_HIDDEN_FLAG should be replaced with SYSTEM_INVISIBLE (or COMPLETELY_INVISIBLE?).
Feature Definition:-
This feature adds invisible column functionality to the server.
There are 4 levels of "invisibility":
1. Not invisible (NOT_INVISIBLE): normal columns created by the user.
2. A little bit invisible (USER_DEFINED_INVISIBLE): columns that the
user has marked invisible. They aren't shown in SELECT * and they
don't require values in INSERT table VALUE (...). Otherwise
they behave as normal columns.
3. More invisible (SYSTEM_INVISIBLE): can be queried explicitly,
otherwise invisible to everything. Think ROWID system column.
Because they're invisible to ALTER TABLE and to CREATE TABLE,
they cannot be created or dropped by the user; they're created by the system.
The user cannot create a column with the same name as a
SYSTEM_INVISIBLE column.
4. Very invisible (COMPLETELY_INVISIBLE): as above, but cannot be
queried either. They can only show up in EXPLAIN EXTENDED (might
be possible for a very invisible indexed virtual column) but
otherwise they don't exist for the user. If the user creates a column
with the same name as a COMPLETELY_INVISIBLE column, then the
COMPLETELY_INVISIBLE column is renamed. So it is completely
invisible to the user.
Invisible Index(HA_INVISIBLE_KEY):-
Creation of invisible columns requires a new type of index which
is visible only to the system. The user can't see/alter/create/delete
this index. If the user creates an index with the same name as an
invisible index, then the invisible index is renamed.
Syntax Details:-
Only a USER_DEFINED_INVISIBLE column can be created by the user. This
is done by adding the INVISIBLE suffix after the column definition.
Create table t1( a int invisible, b int);
Rules:-
There are some rules/restrictions related to the use of invisible columns
1. All the columns in a table can't be invisible.
Create table t1(a int invisible); \\error
Create table t1(a int invisible, b int invisible); \\error
2. If you want an invisible column to be NOT NULL then you have to supply
a default value for the column.
Create table t1(a int, b int invisible not null); \\error
3. If you create a view/create table with select *, this won't copy
invisible fields. So the newly created view/table won't have any invisible
columns.
Create table t2 as select * from t1;//t2 won't have t1 invisible column
Create view v1 as select * from t1;//v1 won't have t1 invisible column
4. Invisibility won't be forwarded to the next table in any case of create
table/view as select */(a,b,c) from table.
Create table t2 as select a,b,c from t1; // t2 will have t1 invisible
// column(b), but this won't be invisible in t2
Create view v1 as select a,b,c from t1; // v1 will have t1 invisible
// column(b), but this won't be invisible in v1
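The levels and rules 3 and 4 above can be modelled in a few lines. This is a toy model, not the server implementation: the numeric enum values and the helper names select_star, select_explicit and create_as_select are assumptions made for illustration.

```python
from enum import IntEnum


class Visibility(IntEnum):
    # Numeric values are assumed; only their ordering matters here.
    NOT_INVISIBLE = 0
    USER_DEFINED_INVISIBLE = 1
    SYSTEM_INVISIBLE = 2
    COMPLETELY_INVISIBLE = 3


def select_star(columns):
    """Columns that SELECT * expands to: only fully visible ones."""
    return [name for name, vis in columns if vis == Visibility.NOT_INVISIBLE]


def select_explicit(columns, wanted):
    """Explicitly named columns; COMPLETELY_INVISIBLE ones cannot be queried."""
    by_name = dict(columns)
    for name in wanted:
        if by_name.get(name, Visibility.COMPLETELY_INVISIBLE) == Visibility.COMPLETELY_INVISIBLE:
            raise KeyError(f"unknown column {name!r}")
    return list(wanted)


def create_as_select(selected):
    """CREATE TABLE/VIEW ... AS SELECT: invisibility is never forwarded."""
    return [(name, Visibility.NOT_INVISIBLE) for name in selected]


t1 = [
    ("a", Visibility.NOT_INVISIBLE),
    ("b", Visibility.USER_DEFINED_INVISIBLE),
    ("c", Visibility.NOT_INVISIBLE),
]

t2_star = create_as_select(select_star(t1))                       # rule 3
t2_named = create_as_select(select_explicit(t1, ["a", "b", "c"]))  # rule 4

print(select_star(t1))                # ['a', 'c']
print([name for name, _ in t2_named])  # ['a', 'b', 'c']
```

In both derived tables every column ends up as NOT_INVISIBLE; the only difference is whether b is copied at all.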
Implementation Details:-
Parsing:- INVISIBLE_SYM is added into vcol_attribute (so it's like the unique
suffix). It is also added into keyword_sp_not_data_type so that a table
can have a column named invisible.
Implementation details are given for each modified/created function.
(Some functions are left out as they are self-explanatory)
(m= Modified, n= Newly Created)
mysql_prepare_create_table(m):- Extra checks for invisible columns are
added. Some DBUG_EXECUTE_IF hooks are also added for test cases.
mysql_prepare_alter_table(m):- Now this will drop all the
COMPLETELY_INVISIBLE columns and HA_INVISIBLE_KEY indexes. Further
modifications are made to prevent drop/change/delete of a SYSTEM_INVISIBLE
column.
build_frm_image(m):- Now this allows incorporating field_visibility
status into the frm image. To remain compatible with old frms,
field_visibility info is only written when at least one field is
not NOT_INVISIBLE.
extra2_write_additional_field_properties(n):- This will write field
visibility info into buffer. We first write EXTRA2_FIELD_FLAGS into
buffer/frm , then each next char will have field_visibility for each
field.
init_from_binary_frm_image(m):- Now if we get EXTRA2_FIELD_FLAGS,
then we read the next n (n= number of fields) chars and set the
field_visibility. We also increment
thd->status_var.feature_invisible_columns. One important thing to
note: if we find that a key contains a field whose visibility is
> USER_DEFINED_INVISIBLE, then we declare this key an invisible
key.
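The write/read logic described above for the frm image can be sketched as follows. This is a simplified model under stated assumptions, not the real frm layout: FIELD_FLAGS_TAG is a hypothetical stand-in for EXTRA2_FIELD_FLAGS, any length prefix or other framing of the real format is left out, and the function names are invented for the sketch.

```python
NOT_INVISIBLE, USER_DEFINED_INVISIBLE, SYSTEM_INVISIBLE, COMPLETELY_INVISIBLE = range(4)

FIELD_FLAGS_TAG = 0x81  # hypothetical tag standing in for EXTRA2_FIELD_FLAGS


def write_visibility(visibilities):
    """Emit the tag plus one byte per field, or nothing for old-style tables."""
    if all(v == NOT_INVISIBLE for v in visibilities):
        return b""  # keeps the image identical to an old frm
    return bytes([FIELD_FLAGS_TAG]) + bytes(visibilities)


def read_visibility(buf, n_fields):
    """Read n_fields visibility bytes after the tag; default to visible."""
    if not buf or buf[0] != FIELD_FLAGS_TAG:
        return [NOT_INVISIBLE] * n_fields
    return list(buf[1:1 + n_fields])


def key_is_invisible(key_fields, visibilities):
    """A key is invisible if any of its fields is above USER_DEFINED_INVISIBLE."""
    return any(visibilities[i] > USER_DEFINED_INVISIBLE for i in key_fields)


fields = [NOT_INVISIBLE, USER_DEFINED_INVISIBLE, SYSTEM_INVISIBLE]
image = write_visibility(fields)
restored = read_visibility(image, len(fields))

print(restored == fields)                     # True
print(key_is_invisible([0, 1], restored))     # False
print(key_is_invisible([2], restored))        # True
print(write_visibility([NOT_INVISIBLE] * 2))  # b''
```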
sql_show.cc is changed accordingly to make show table, show keys
correct.
mysql_insert(m):- If we detect an insert of the form
insert into t1 values(1,1); that is, without an explicit column list,
we check whether the table has invisible fields, and if so
we reset the whole record. Why? First, we want hidden columns
to get their default/null value. Second, auto_increment has the property
of no default and no null, which violates rule 2 for invisible columns, and
because of this it was giving an error. Resetting table->record[0]
eliminates this issue. For more info, put a breakpoint on handler::write_row
and inspect the auto_increment value.
fill_record(m):- We skip an invisible column in the loop because
it has already been reset or will get its default value.
Test cases:- Since columns above USER_DEFINED_INVISIBLE cannot be added
directly, debug_dbug is used to create them in mysql_prepare_create_table.
Patch Credit:- Serg Golubchik
* again, as in 10.2, NOW is a keyword only if followed by parentheses
* use AS OF CURRENT_TIMESTAMP or AS OF NOW()
* AS OF CURRENT_TIMESTAMP and AS OF NOW() mean AS OF NOW(6),
not AS OF NOW(0), (same behavior as in a DEFAULT clause)