This commits enables reloading of engine-independent statistics
without flushing the table from table definition cache.
This is achieved by allowing multiple version of the
TABLE_STATISTICS_CB object and having independent pointers to it in
TABLE and TABLE_SHARE. The TABLE_STATISTICS_CB object have reference
pointers and are freed when no one is pointing to it anymore.
TABLE's TABLE_STATISTICS_CB pointer is updated to use the
TABLE_SHARE's pointer when read_statistics_for_tables() is called at
the beginning of a query.
Main changes:
- read_statistics_for_table() will allocate an new TABLE_STATISTICS_CB
object.
- All get_stat_values() functions has a new parameter that tells
where collected data should be stored. get_stat_values() are not
using the table_field object anymore to store data.
- All get_stat_values() functions returns 1 if they found any
data in the statistics tables.
Other things:
- Fixed INSERT DELAYED to not read statistics tables.
- Removed Statistics_state from TABLE_STATISTICS_CB as this is not
needed anymore as wer are not changing TABLE_SHARE->stats_cb while
calculating or loading statistics.
- Store values used with store_from_statistical_minmax_field() in
TABLE_STATISTICS_CB::mem_root. This allowed me to remove the function
delete_stat_values_for_table_share().
- Field_blob::store_from_statistical_minmax_field() is implemented
but is not normally used as we do not yet support EIS statistics
for blobs. For example Field_blob::update_min() and
Field_blob::update_max() are not implemented.
Note that the function can be called if there is an concurrent
"ALTER TABLE MODIFY field BLOB" running because of a bug in
ALTER TABLE where it deletes entries from column_stats
before it has an exclusive lock on the table.
- Use result of field->val_str(&val) as a pointer to the result
instead of val (safetly fix).
- Allocate memory for collected statistics in THD::mem_root, not in
in TABLE::mem_root. This could cause the TABLE object to grow if a
ANALYZE TABLE was run many times on the same table.
This was done in allocate_statistics_for_table(),
create_min_max_statistical_fields_for_table() and
create_min_max_statistical_fields_for_table_share().
- Store in TABLE_STATISTICS_CB::stats_available which statistics was
found in the statistics tables.
- Removed index_table from class Index_prefix_calc as it was not used.
- Added TABLE_SHARE::LOCK_statistics to ensure we don't load EITS
in parallel. First thread will load it, others will reuse the
loaded data.
- Eliminate read_histograms_for_table(). The loading happens within
read_statistics_for_tables() if histograms are needed.
One downside is that if we have read statistics without histograms
before and someone requires histograms, we have to read all statistics
again (once) from the statistics tables.
A smaller downside is the need to call alloc_root() for each
individual histogram. Before we could allocate all the space for
histograms with a single alloc_root.
- Fixed bug in MyISAM and Aria where they did not properly notice
that table had changed after analyze table. This was not a problem
before this patch as then the MyISAM and Aria tables where flushed
as part of ANALYZE table which did hide this issue.
- Fixed a bug in ANALYZE table where table->records could be seen as 0
in collect_statistics_for_table(). The effect of this unlikely bug
was that a full table scan could be done even if
analyze_sample_percentage was not set to 1.
- Changed multiple mallocs in a row to use multi_alloc_root().
- Added a mutex protection in update_statistics_for_table() to ensure
that several tables are not updating the statistics at the same time.
Some of the changes in sql_statistics.cc are based on a patch from
Oleg Smirnov <olernov@gmail.com>
Co-authored-by: Oleg Smirnov <olernov@gmail.com>
Co-authored-by: Vicentiu Ciorbaru <cvicentiu@gmail.com>
Reviewer: Sergei Petrunia <sergey@mariadb.com>
Sergei's commit ac6b3c4430 implemented handler status counters
compensation for underlying handlers like ha_partition.
`index_read_idx_map` is missing there, but it should have been fixed as
well (proof: ha_partition::index_read_idx_map never calls
ha_partition::index_read_map).
Note: all this compensation logic could be broken for subpartitions! (We
can experience double decrement)
A syntax error was reported for any INSERT statement with explicit
partition selection it if i used a column list.
Fixed by saving the parsing place before parsing the clause for explicit
partition selection and restoring it when the clause has been parsed.
Change the defaults:
-histogram_size=0
+histogram_size=254
-histogram_type=SINGLE_PREC_HB
+histogram_type=DOUBLE_PREC_HB
Adjust the testcases:
- Some have ignorable changes in EXPLAIN outputs and
more counter increments due to EITS table reads.
- Testcases that meaningfully depend on the old defaults
are changed to use the old values.
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.
main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.