In the function QUICK_RANGE_SELECT::init_ror_merged_scan we create a seperate handler if the handler in
head->file cannot be reused. The flag free_file tells us if we have a seperate handler or not.
There are cases where you might create a handler and then there might be a failure(running ALTER)
and then we have to revert the handler back to the original one. The code does that
but it does not reset the flag 'free_file' in this case.
Also backported f2c418079d.
The problem here is EITS statistics does not calculate statistics for the partitions of the table.
So a temporary solution would be to not read EITS statistics for partitioned tables.
Also disabling reading of EITS for columns that participate in the partition list of a table.
In this case we were trying to access memory for key_parts which we did not
assign for a fields because it did not any EITS statistics.
The check if EITS statistics for a column is avaialable or not was missing.
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.
main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
For index merge union[or sort union], the estimates are not taken into account while calculating the selectivity of
a condition. So instead of showing the estimates of the index merge union[or sort union], it shows estimates equal to
all the records of the table.
The fix for the issue is to include the selectivity of index merge
union[or sort union] while calculating the selectivity of a condition.
optimizer_use_condition_selectivity>=3
Selectivity analysis should be disabled for Geometrical columns
for the case like geometric_field= string_constant.
Currently for selectivity calculation we perform range analysis for a column even when we don't have any statistics(EITS).
This makes less sense but is used to catch contradiction for WHERE condition.
So the solution is to not perform range analysis for selectivity calculation for columns that do not have statistics.
and use_stat_tables= PREFERABLY
Currently the code that calculates selectivity for a table does not take into account the case when
we can have GROUP BY optimization (looses index scan).
This patch introduces support for the system variable eq_range_index_dive_limit
that existed in MySQL starting from 5.6. The variable sets a limit for
index dives into equality ranges. Index dives are performed by optimizer
to estimate the number of rows in range scans. Index dives usually provide
good estimate but they are pretty expensive. To estimate the number of rows
in equality ranges statistical data on indexes can be employed. Its usage gives
not so good estimates but it's cheap. So if the number of equality dives
required by an index scan exceeds the set limit no dives for equality
ranges are performed by the optimizer for this index.
As the new system variable is introduced in a stable version the default
value for it is set to a special value meaning there is no limit for the number
of index dives performed by the optimizer.
The patch partially uses the MySQL code for WL 5957
'Statistics-based Range optimization for many ranges'.
Now the boolean data type is preserved in hybrid functions and MIN/MAX,
so COALESCE(bool_expr,bool_expr) and MAX(bool_expr) are correctly
detected by JSON_OBJECT() as being boolean rather than numeric expressions.
- Removing tests of item->type() against INT_ITEM and replacing
them to calls of new method item->is_bool_literal().
- Changing constant conditions to use Item_bool() instead of Item_int().
Lots of changes:
* calculate the current history partition in ::external_lock(),
not in ::write_row() or ::update_row()
* remove dynamically collected per-partition row_end stats
* no full table scan in open_table_from_share to calculate these
stats, no manual MDL/thr_locks in open_table_from_share
* no shared stats in TABLE_SHARE = no mutexes or condition waits when
calculating current history partition
* always compare timestamps, don't convert them to MYSQL_TIME
(avoid DST ambiguity, and it's faster too)
* correct interval handling, 1 month = 1 month, not 30 * 24 * 3600 seconds
* save/restore first partition start time, and count intervals from there
* only allow to drop first partitions if INTERVAL
* when adding new history partitions, split the data in the last history
parition, if it was overflowed
* show partition boundaries in INFORMATION_SCHEMA.PARTITIONS
There were two problems related to the bug report:
1. Item_datetime::get_date() was not implemented.
So execution went through val_int() followed
by int-to-datetime or int-to-time conversion.
This was the reason why the optimizer did not
work well on data with fractional seconds.
2. Item_datetime::set() did not have a TIME specific code
to mix months and days to hours after unpack_time().
This is why the optimizer did not work well with negative
TIME values, as well as huge time values.
Changes:
1. Overriding Item_datetime::get_date(), to return ltime.
This fixes the problem N1.
2. Cleanup: Moving pack_time() and unpack_time() from
sql-common/my_time.c and include/my_time.h to
sql/sql_time.cc and sql/sql_time.h, as they are not needed
on the client side.
3. Adding a new "enum_mysql_timestamp_type ts_type" parameter
to unpack_time() and moving the TIME specific code to mix
months and days with hours inside unpack_time().
Adding a new "ts_type" parameter to Item_datetime::set(),
to pass it from the caller down to unpack_time().
So now the TIME specific code is automatically called
from Item_datetime::set(). This fixes the problem N2.
This change also helped to get rid of duplicate TIME specific code
from other three places, where mixing month/days to hours
was done immediately after unpack_time().
Moving the DATE specific code to zero hhmmssff
from Item_func_min_max::get_date_native to inside unpack_time(),
for symmetry.
4. Removing the virtual method in_vector::result_type(),
adding in_vector::type_handler() instead.
This helps to get result_type(), field_type(),
mysql_timestamp_type() of an in_vector easier.
Passing type_handler()->mysql_timestamp_type() as
a new parameter to Item_datetime::set() inside
in_temporal::value_to_item().
5. Cleaup: Removing separate implementations of in_datetime::get_value()
and in_time::get_value(). Adding a single implementation
in_temporal::get_value() instead.
Passing type_handler()->field_type() to get_value_internal().
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider,spider, sphinx and connect for now.
This will make it easier to how memory allocation is done when debugging
with either DBUG or gdb.
Will especially help when debugging stored procedures
Main change is a name argument as second argument to init_alloc_root()
init_sql_alloc()
Other things:
- Added DBUG_ENTER/EXIT to some Virtual_tmp_table functions
TRASH was mapped to TRASH_FREE and was supposed to be used for memory
that should not be accessed anymore, while TRASH_ALLOC() is to be
used for uninitialized but to-be-used memory.
But sometimes TRASH() was used in the latter sense.
Remove TRASH() macro, always use explicit TRASH_ALLOC() or TRASH_FREE().
Other things, mainly to get
create_mysqld_error_find_printf_error tool to work:
- Added protection to not include mysqld_error.h twice
- Include "unireg.h" instead of "mysqld_error.h" in server
- Added protection if ER_XX messages are already defined
- Removed wrong calls to my_error(ER_OUTOFMEMORY) as
my_malloc() and my_alloc will do this automatically
- Added missing %s to ER_DUP_QUERY_NAME
- Removed old and wrong calls to my_strerror() when using
MY_ERROR_ON_RENAME (wrong merge)
- Fixed deadlock error message from Galera. Before the extra
information given to ER_LOCK_DEADLOCK was missing because
ER_LOCK_DEADLOCK doesn't provide any extra information.
I kept #ifdef mysqld_error_find_printf_error_used in sql_acl.h
to make it easy to do this kind of check again in the future
Code in QUICK_RANGE_SELECT::init_ror_merged_scan() could theoretically
have caused crashes if this was ever called from an update or delete
This also found a bug in the vcol/range.result. file.
Other things:
- Cleanup of allocated bitmaps done in open(), which
simplifies init_partition_bitmaps()
- Add needed defines in ha_spider.cc to enable new spider code
- Fixed some DBUG_PRINT() to be consistent with normal code
- Removed end space
- The changes in test cases partition_innodb, partition_range,
partition_pruning etc are becasue partitions can now more exactly
calculate the number of rows in a range.
Contains spider patches:
014,015,023,033,035,037,040,042,044,045,049,050,051,053,059
- Fix win64 pointer truncation warnings
(usually coming from misusing 0x%lx and long cast in DBUG)
- Also fix printf-format warnings
Make the above mentioned warnings fatal.
- fix pthread_join on Windows to set return value.
Before this patch running full mtr generated some 70 cores (at least
on systemd). Now no cores should be generated.
- Changed DBUG_ABORT()'s used by mysql-test-run to DBUG_SUICIDE()
- Changed DBUG_ABORT() used to crash server with core to DBUG_ASSERT(0)
- DBUG_ASSERT now flushes DBUG files
- Added sql/mariadb.h file that should be included first by files in sql
directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h
In get_mm_tree we have to change Field_geom::geom_type to
GEOMETRY as we have to let storing all types of the spatial features
in the field. So now we restore the original geom_type as it's
done.
Other things
- Ensure that ut_d() is set to EXPR if ut_ad() is DEBUG_ASSERT()
If not, we will get a crash in purge_sys_t::~purge_sys_t() as
this ut_ad() code expect's that the ut_d() codes has been executed
* based on RANGE pruning by COLUMNS (sys_trx_end) condition
* removed DEFAULT; AS OF NOW is always last; current VERSIONING as last non-empty (or first empty)
* Min/Max stats in TABLE_SHARE
* ALTER TABLE ADD PARTITION adds before AS OF NOW partition
This is a joint patch for:
- MDEV-12426 Add Field::type_handler()
- MDEV-12432 Range optimizer for ENUM and SET does not return "Impossible WHERE" in some case
With the new type handler approach being added to Field, it was easier to fix
MDEV-12432 rather than to reproduce the old ENUM/SET behavior.
The patch does the following:
1. Adds Field::type_handler(), according to the task description.
2. Fixes the asymmetry between Fields and Items of ENUM and SET field types.
Field_enum::cmp_type() returned INT_RESULT
Item*::cmp_type() returned STRING_RESULT for ENUM and SET expressions
This asymmetry was originally done for easier coding in the optimizer sources.
However, in 10.1 we moved a lot of code to methods of the class Field:
- test_if_equality_guarantees_uniqueness()
- can_be_substituted_to_equal_item()
- get_equal_const_item()
- can_optimize_keypart_ref()
- can_optimize_hash_join()
- can_optimize_group_min_max()
- can_optimize_range()
- can_optimize_outer_join_table_elimination()
As of 10.2 only a few lines of the code in opt_range.cc, field.cc and field.h
still relayed on the fact that Field_enum::cmp_type() returns INT_RESULT:
- Some asserts in field.cc
- Field_year::get_copy_func()
- Item_func_like::get_mm_leaf()
- Item_bool_func::get_mm_leaf()
These lines have been fixed.
3. Item_bool_func::get_mm_leaf() did not work well for ENUM/SET,
see MDEV-12432. So the ENUM/SET code was rewritten, and the relevant
code in Field_enum::store() and Field_set::store() was fixed to
properly return errors to the caller.
4. The result of Field_decimal::result_type() was changed from REAL_RESULT
to DECIMAL_RESULT. Data type aggregation (e.g. in COALESCE()) is now more
precise for old DECIMAL, because Item::decimal_precision() now goes through
the DECIMAL_RESULT branch. Earlier it went through the REAL_RESULT branch.
Benefits of this patch:
- Removed a lot of calls to strlen(), especially for field_string
- Strings generated by parser are now const strings, less chance of
accidently changing a string
- Removed a lot of calls with LEX_STRING as parameter (changed to pointer)
- More uniform code
- Item::name_length was not kept up to date. Now fixed
- Several bugs found and fixed (Access to null pointers,
access of freed memory, wrong arguments to printf like functions)
- Removed a lot of casts from (const char*) to (char*)
Changes:
- This caused some ABI changes
- lex_string_set now uses LEX_CSTRING
- Some fucntions are now taking const char* instead of char*
- Create_field::change and after changed to LEX_CSTRING
- handler::connect_string, comment and engine_name() changed to LEX_CSTRING
- Checked printf() related calls to find bugs. Found and fixed several
errors in old code.
- A lot of changes from LEX_STRING to LEX_CSTRING, especially related to
parsing and events.
- Some changes from LEX_STRING and LEX_STRING & to LEX_CSTRING*
- Some changes for char* to const char*
- Added printf argument checking for my_snprintf()
- Introduced null_clex_str, star_clex_string, temp_lex_str to simplify
code
- Added item_empty_name and item_used_name to be able to distingush between
items that was given an empty name and items that was not given a name
This is used in sql_yacc.yy to know when to give an item a name.
- select table_name."*' is not anymore same as table_name.*
- removed not used function Item::rename()
- Added comparision of item->name_length before some calls to
my_strcasecmp() to speed up comparison
- Moved Item_sp_variable::make_field() from item.h to item.cc
- Some minimal code changes to avoid copying to const char *
- Fixed wrong error message in wsrep_mysql_parse()
- Fixed wrong code in find_field_in_natural_join() where real_item() was
set when it shouldn't
- ER_ERROR_ON_RENAME was used with extra arguments.
- Removed some (wrong) ER_OUTOFMEMORY, as alloc_root will already
give the error.
TODO:
- Check possible unsafe casts in plugin/auth_examples/qa_auth_interface.c
- Change code to not modify LEX_CSTRING for database name
(as part of lower_case_table_names)
The patch actually fixes the old defect of the optimizer that
could not extract keys for range access from IN predicates
with row arguments.
This problem was resolved in the mysql-5.7 code. The patch
supersedes what was done there:
- it can build range access when not all components of
the first row argument are refer to the columns of the table
for which the range access is constructed.
- it can use equality predicates to build range access
to the table that is not referred to in this argument.
In get_mm_tree we have to change Field_geom::geom_type to
GEOMETRY as we have to let storing all types of the spatial features
in the field. So now we restore the original geom_type as it's
done.
Also, implement MDEV-11027 a little differently from 5.5 and 10.0:
recv_apply_hashed_log_recs(): Change the return type back to void
(DB_SUCCESS was always returned).
Report progress also via systemd using sd_notifyf().
* rename to "keyread" (to avoid conflicts with tokudb),
* change from bool to uint and store the keyread index number there
* provide a bool accessor to check if keyread is enabled
mark_columns_used_by_index used to do
reset + mark_columns_used_by_index_no_reset + start keyread + set bitmaps
Now prepare_for_keyread does that, while mark_columns_used_by_index
does only reset + mark_columns_used_by_index_no_reset,
just as its name suggests.
move TABLE::key_read into handler. Because in index merge and DS-MRR
there can be many handlers per table, and some of them use
key read while others don't. "keyread" is really per handler,
not per TABLE property.
When building different range and index-merge trees the range optimizer
could build an index-merge tree with an index scan containing less ranges
then needed. This index-merge could be chosen as the best. Following this
index-merge the executioner missed some rows in the result set.
The invalid index scan was built due to an inconsistency in the code
back-ported from mysql into 5.3 that fixed mysql bug #11765831:
the code added to key_or() could change shared keys of the second
ored tree. Partially the problem was fixed in the patch for mariadb
bug #823301, but it turned out that only partially.
Found and fixed 2 problems:
- Filesort addon fields didn't mark virtual columns properly
- multi-range-read calculated vcol bitmap but was not using it.
This caused wrong vcol field to be calculated on read, which caused the assert.
In file sql/opt_range.cc,when calculate_cond_selectivity_for_table() is called with optimizer_use_condition_selectivity=4 then
- thd->no_errors is set to 1
- the original value of thd->no_error is not restored to its original value
- this is causing the assertion to fail in the subsequent queries
Fixed by restoring the original value of thd->no_errors
In the function create_key_parts_for_pseudo_indexes()
the key part structures of pseudo-indexes created for
BLOB fields were set incorrectly.
Also the key parts for long fields must be 'truncated'
up to the maximum length acceptable for key parts.
Implement a technique mentioned in the MDEV. Under certain conditions,
cond(inner_table.col) can be substituted for cond(outer_table.col) for
the purpose of range analysis.
Fix get_quick_keys(): When building range tree from a condition
in form
keypart1=const AND (keypart2 < 0 OR keypart2>=0)
the SEL_ARG for keypart2 represents an interval (-inf, +inf).
However, the logic that sets UNIQUE_RANGE flag fails to recognize
this, and sets UNIQUE_RANGE flag if (keypart1, keypart2) covered
a unique key.
As a result, range access executor assumes the interval can have
at most one row and only reads the first row from it.
The crash was caused by this problem:
get_best_group_min_max() tries to construct query plans for keys that
are not processed by the range optimizer. This wasn't a problem as long
as SEL_TREE::keys was an array of MAX_KEY elements.
However, now it is a Mem_root_array and only has elements for the used
keys, and get_best_group_min_max attempts to address beyond the end of
the array.
The obvious way to fix the crash was to port (and improve) a part of
96fcfcbd7b5120e8f64fd45985001eca8d36fbfb from mysql-5.7. This makes
get_best_group_min_max not to consider indexes that Mem_root_arrays
have no element for.
After that, I got non-sensical query plans (see MDEV-10325 for details).
Fixed that by making get_best_group_min_max to check if the index is in
table->keys_in_use_for_group_by bitmap.
The problem was that the loop in get_func_mm_tree()
accessed improperly initialized instances of String,
which resided in the bzero'ed part of the in_vector::base array.
Strings in in_vector::base are originally initialized
in Item_func_in::fix_length_and_dec(),
in in_vector::in_vector() using sql_calloc,
rather than using a String constructor, so their str_charset
members are originally equal to NULL.
Strings in in_vector::base are later initialized
to good values in Item_func_in::fix_length_and_dec(),
using array->set(), in this code:
uint j=0;
for (uint i=1 ; i < arg_count ; i++)
{
array->set(j,args[i]);
if (!args[i]->null_value) // Skip NULL values
j++;
else
have_null= 1;
}
if ((array->used_count= j))
array->sort();
NULLs are not taken into account, so at the end
array->used_count can be smaller than array->count.
This patch fixes the loop in opt_range.cc, in get_func_mm_tree(),
to access only properly initialized elements in in_vector::base,
preventing access to its bzero'ed non-initialized tail.
Part #2: make tree_or(tree1, tree2) to reuse tree1 for the result object
for simple cases. These include key IN (c1, ... cN).
The reuse was happening in old MySQL versions, but we stopped doing it
in the "fair choice between range and index_merge" patch.
A partial backport of 67f21fb3a077dedfd14b9ca720e926c55e682f93,
Bug#22283790: RANGE OPTIMIZER UTILIZES TOO MUCH MEMORY WITH MANY OR CONDITIONS
The backported part changes SEL_TREE::keys from being an array of
MAX_KEY elements (64*8=512 bytes) to a Mem_root_array<SEL_ARG*> (32 bytes +
alloc'ed array of as many elements as we need).
The patch doesn't fix the "not limiting memory" part, but the memory usage
is much lower with it.
filesort and init_read_record() for the same table.
This will simplify code for WINDOW FUNCTIONS (MDEV-6115)
- Filesort_info renamed to SORT_INFO and moved to filesort.h
- filesort now returns SORT_INFO
- init_read_record() now takes a SORT_INFO parameter.
- unique declaration is moved to uniques.h
- subselect caching of buffers is now more explicit than before
- filesort_buffer is now reusable even if rec_length has changed.
- filsort_free_buffers() and free_io_cache() calls are removed
- Remove one malloc() when using get_addon_fields()
Other things:
- Added --debug-assert-on-not-freed-memory option to make it easier to
debug some not-freed-memory issues.
"Re-factor the code for post-join operations".
The patch mainly contains the code ported from mysql-5.6 and
created for two essential architectural changes:
1. WL#5558: Resolve ORDER BY execution method at the optimization stage
2. WL#6071: Inline tmp tables into the nested loops algorithm
The first task was implemented for mysql-5.6 by Ole John Aske.
It allows to make all decisions on ORDER BY operation at the optimization
stage.
The second task implemented for mysql-5.6 by Evgeny Potemkin adds JOIN_TAB
nodes for post-join operations that require temporary tables. It allows
to execute these operations within the nested loops algorithm that used to
be used before this task only for join queries. Besides these task moves
all planning on the execution of these operations from the execution phase
to the optimization phase.
Some other re-factoring changes of mysql-5.6 were pulled in, mainly because
it was easier to pull them in than roll them back. In particular all
changes concerning Ref_ptr_array were incorporated.
The port required some changes in the MariaDB code that concerned the
functionality of EXPLAIN and ANALYZE. This was done mainly by Sergey
Petrunia.
create_partition_index_description() had wrong logic to calculate
length of the key value buffer that is used by the range optimizer.
For some reason it used MAX(partitioning_columns_len,
subpartitioning_columns_len) while it should use SUM of these values.