For EITS collection, min and max fields are allocated for each column
that is set in the read_set bitmap of a table. This allocation of min and max
fields happens inside alloc_statistics_for_table().
For a partitioned table, ha_rnd_init() is called inside the function
collect_statistics_for_table(), which sets the read_set bitmap for the columns
used in the partition expression. This happens only when there is a write lock
on the partitioned table.
But the allocation happens before this, so min and max fields are not allocated
for the columns involved in the partition expression.
This resulted in a crash: the EITS statistics were collected, but there
were no min and max fields to store the values in.
The fix is to call ha_rnd_init() inside the function alloc_statistics_for_table(),
which makes sure that min and max fields are allocated for the columns
involved in the partition expression, as sketched below.
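A simplified sketch of the reordering, assuming the (THD*, TABLE*) signature;
the real function does more work and the error handling here is illustrative:

    /* sketch, not the actual MariaDB code */
    int alloc_statistics_for_table(THD *thd, TABLE *table)
    {
      /*
        For partitioned tables ha_rnd_init() marks the columns used in
        the partition expression in table->read_set, so it must run
        before min/max fields are allocated from that bitmap.
      */
      if (table->file->ha_rnd_init(TRUE))
        return 1;
      /* ... allocate min/max fields for every column set in read_set ... */
      return 0;
    }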
The assertion failed in handler::ha_reset upon SELECT under
READ UNCOMMITTED from a table with an index on a virtual column.
This was a debug-only failure, though the problem is much wider:
* MY_BITMAP is a structure containing my_bitmap_map, the latter being the raw
bitmap.
* read_set, write_set and vcol_set of TABLE are pointers to MY_BITMAP.
* The rest of the MY_BITMAPs are stored in TABLE and TABLE_SHARE.
* The stored MY_BITMAPs, like orig_read_set etc, and sometimes all_set and
tmp_set, are assigned to these pointers.
* Sometimes tmp_use_all_columns is used to substitute the raw bitmap
directly with all_set.bitmap.
* Sometimes the bitmaps are even modified directly, like in
TABLE::update_virtual_field(): bitmap_clear_all(&tmp_set) is called.
The last three bullets in the list, when used together (which is almost
always the case), make the program flow cumbersome and impossible to follow,
not to mention the errors they cause, like this MDEV-17556, where the tmp_set
pointer was assigned to read_set, write_set and vcol_set, then its bitmap
was substituted with all_set.bitmap by a dbug_tmp_use_all_columns() call,
and then bitmap_clear_all(&tmp_set) was applied on top of all this.
To untangle this knot, one rule should be applied:
* Never substitute bitmaps! This patch is about exactly this.
The orig_* and all_set bitmaps are already never substituted.
This patch changes the following function prototypes:
* tmp_use_all_columns, dbug_tmp_use_all_columns
to accept MY_BITMAP** and to return MY_BITMAP * instead of my_bitmap_map*
* tmp_restore_column_map, dbug_tmp_restore_column_maps to accept
MY_BITMAP* instead of my_bitmap_map*
These functions will now substitute read_set/write_set/vcol_set directly,
and won't touch the underlying bitmaps.
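A sketch of the prototype change (the dbug_* variants follow the same
pattern; exact declarations may differ):

    /* before: the raw bitmap inside MY_BITMAP was swapped */
    my_bitmap_map *tmp_use_all_columns(TABLE *table, MY_BITMAP *bitmap);
    void tmp_restore_column_map(MY_BITMAP *bitmap, my_bitmap_map *old);

    /* after: the TABLE's MY_BITMAP pointer is swapped, bitmaps untouched */
    MY_BITMAP *tmp_use_all_columns(TABLE *table, MY_BITMAP **bitmap);
    void tmp_restore_column_map(MY_BITMAP **bitmap, MY_BITMAP *old);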
An overflow was happening on Windows because there sizeof(ulong) is 4 bytes,
while it is 8 bytes on Linux.
Switched avg_frequency and avg_length for column statistics to ulonglong.
Switched avg_frequency for index statistics to ulonglong.
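An illustrative sketch of the change; ulong/ulonglong are the server's
typedefs, and the real members live in the statistics classes in
sql_statistics.h:

    /* 64-bit Windows is LLP64 (sizeof(ulong) == 4), so values of 2^32
       and above wrapped; 64-bit Linux is LP64 (sizeof(ulong) == 8) */
    struct Column_stats_sketch
    {
      ulonglong avg_frequency;   /* was: ulong */
      ulonglong avg_length;      /* was: ulong */
    };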
Previously multiple threads were allowed to load histograms concurrently.
There were no known problems caused by this, but given the amount of data
races in this code, they would have appeared sooner or later.
To avoid a scalability bottleneck, histogram loading is protected by a
per-TABLE_SHARE atomic variable.
Whenever histograms were already loaded by a preceding statement (the hot
path), a scalable load-acquire check is performed.
Whenever histograms have to be loaded anew, mutual exclusion for loaders
is established by the atomic variable. If histograms are being loaded
concurrently, the statement waits until the load is completed, as sketched
below.
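A minimal self-contained sketch of this tri-state protocol; the names and
the waiting strategy are illustrative, not the actual MariaDB code:

    #include <atomic>
    #include <thread>

    enum histograms_state { NOT_LOADED, LOADING, LOADED };
    static std::atomic<histograms_state> state{NOT_LOADED};

    static void read_histograms() { /* read mysql.column_stats here */ }

    void load_histograms_if_needed()
    {
      /* hot path: a preceding statement already loaded them */
      if (state.load(std::memory_order_acquire) == LOADED)
        return;
      histograms_state expected= NOT_LOADED;
      if (state.compare_exchange_strong(expected, LOADING))
      {
        read_histograms();                 /* we are the only loader */
        state.store(LOADED, std::memory_order_release);
      }
      else
        while (state.load(std::memory_order_acquire) != LOADED)
          std::this_thread::yield();       /* wait for concurrent loader */
    }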
- Table_statistics::total_hist_size moved to TABLE_STATISTICS_CB: it is only
meaningful within TABLE_SHARE (not used for collected stats).
- TABLE_STATISTICS_CB::histograms_can_be_read and
TABLE_STATISTICS_CB::histograms_are_read are replaced with a tri-state
atomic variable.
- Simplified away alloc_histograms_for_table_share().
Note: there's still likely a data race if a thread attempts to access
histogram data after it failed to load it (because of a concurrent load).
It was there previously and is out of the scope of this effort. One way
of fixing it could be reviving TABLE::histograms_are_read and adding
appropriate checks wherever they are needed.
Part of MDEV-19061 - table_share used for reading statistical tables is
not protected
Previously multiple threads were allowed to load statistics concurrently.
There were no known problems caused by this, but given the amount of data
races in this code, they would have appeared sooner or later.
To avoid a scalability bottleneck, statistics loading is protected by a
per-TABLE_SHARE atomic variable.
Whenever statistics were already loaded by a preceding statement (the hot
path), a scalable load-acquire check is performed.
Whenever statistics have to be loaded anew, mutual exclusion for loaders
is established by the atomic variable. If statistics are being loaded
concurrently, the statement waits until the load is completed (the same
tri-state protocol as sketched above for histograms).
TABLE_STATISTICS_CB::stats_can_be_read and
TABLE_STATISTICS_CB::stats_is_read are replaced with a tri-state atomic
variable.
Part of MDEV-19061 - table_share used for reading statistical tables is
not protected
Removed redundant loops, integrated the logic into the caller instead.
Unified the condition in read_statistics_for_tables(): fewer
"table_share != NULL" checks and no more potential "table_share == NULL"
dereferencing.
Part of MDEV-19061 - table_share used for reading statistical tables is
not protected
MDEV-22073 MSAN use-of-uninitialized-value in
collect_statistics_for_table()
Other things:
* innodb.analyze_table was changed to mainly test statistics
  collection. Was discussed with Marko.
* use compile_time_assert instead of DBUG_ASSERT
* don't use thd->clear_error(), because
  * the error was already consumed by the error handler, so there is
    nothing to clear
  * it's dangerous to clear errors indiscriminately; if the error came
    from outside of read_statistics_for_tables() it must not be cleared
mysql_insert() first opens all affected tables (which implicitly
starts a transaction in InnoDB), then the stat tables.
A failure to open a stat table caused open_tables() to abort
the current stmt transaction (trans_rollback_stmt()). So, from the
server's point of view, the following ha_write_row()-s happened outside
of a transaction, and the server didn't bother to commit them.
The server has a mechanism to prevent a transaction from being
unexpectedly committed or rolled back in the middle of a statement:
if an operation takes place _in a sub-statement_, it cannot change
the transaction state. Operations on stat tables are exactly that;
they are not allowed to change the transaction state. Put them in
a sub-statement to make sure they don't, as sketched below.
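A hedged sketch of the sub-statement wrapping; Sub_statement_state and
THD::reset/restore_sub_statement_state are the server's existing
sub-statement machinery, while the SUB_STMT_STAT_TABLES flag name is an
assumption here:

    Sub_statement_state state;
    /* enter a sub-statement: transaction state can no longer change */
    thd->reset_sub_statement_state(&state, SUB_STMT_STAT_TABLES);
    /* ... ha_write_row() etc. on mysql.table_stats / column_stats ... */
    thd->restore_sub_statement_state(&state);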
Regression after 279a907: read_statistics_for_tables_if_needed() was
called after an open_normal_and_derived_tables() failure.
Fixed by moving the read_statistics_for_tables() call to a branch of
get_schema_stat_record() where the result of open_normal_and_derived_tables()
is checked.
Removed THD::force_read_stats, added read_statistics_for_tables() instead.
Simplified away statistics_for_command_is_needed().
For CMAKE_BUILD_TYPE=Debug, the default MYSQL_MAINTAINER_MODE=AUTO
implies -Werror along with other flags in cmake/maintainer.cmake,
which would break the debug builds when CMAKE_CXX_FLAGS include -O2.
This fix includes a backport of 6dd3f24090
from MariaDB 10.3.
The flag is_stat_field was not set for the min_value and max_value field
items inside the table share. Setting it is a hard requirement, as we don't
want truncation warnings to be thrown when we read values from the
statistics table into the column statistics of the table share fields.
For MDEV-15955, the fix in create_tmp_field_from_item() would cause a
compilation error. After a discussion with Alexander Barkov, the fix
was omitted and only the test case was kept.
In 10.3 and later, MDEV-15955 is fixed properly by overriding
create_tmp_field() in Item_func_user_var.
With DOUBLE precision, histogram values take two bytes each, so a
histogram whose size in bytes is odd leaves its last byte unwritten.
When collecting histograms, this caused the last byte of the record to
be uninitialized. memset the buffer to 0 first to make sure this does
not happen, as in the sketch below.
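A one-line sketch of the fix; the buffer and size names are illustrative:

    /* zero-fill so an odd-sized DOUBLE-precision histogram leaves no
       uninitialized trailing byte in the record */
    memset(histogram_buffer, 0, histogram_size);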
To read histograms for a table, we should first check whether the
allocation of statistics was done; if it was not, we should not try to
read histograms for that table.
The command SHOW INDEXES ignored the setting of the system variable
use_stat_tables to the value 'preferably' and showed statistical
data received from the engine. Similarly, queries over the table
STATISTICS from INFORMATION_SCHEMA ignored this setting. It happened
because the function fill_schema_table_by_open() did not read any data
from statistical tables.
For single-table and multi-table updates, engine-independent statistics
were not being read even if the statistics had been collected.
Fixed it, so that when optimizer_use_condition_selectivity > 2 the
available statistics are read for update queries.
To fix the crash there, we need to make sure that the server stores the
statistical values in the statistical tables in a multi-byte safe way,
as sketched below.
Also there is no need to throw warnings if there is truncation while
storing values from statistical fields.
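A hedged sketch of multi-byte safe truncation; my_charpos() is the
server's helper for finding a byte offset at a character boundary, the
other names are illustrative:

    /* cut val at a character boundary, never in the middle of a
       multi-byte character (sketch; actual code differs) */
    size_t char_limit= stat_field->char_length();
    size_t byte_len= my_charpos(cs, val, val + val_len, char_limit);
    set_if_smaller(byte_len, val_len);
    stat_field->store(val, (uint) byte_len, cs);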
Also, apply the MDEV-17957 changes to encrypted page checksums,
and remove error message output from the checksum function,
because these messages would be useless noise when mariabackup
is retrying reads of corrupted-looking pages, and they are not that
useful during normal server operation either.
The error messages in fil_space_verify_crypt_checksum()
should be refactored separately.
The problem here is that EITS does not calculate statistics for the
partitions of a table.
So a temporary solution is to not read EITS statistics for partitioned
tables.
Also, reading of EITS is disabled for columns that participate in the
partition list of a table.
ANALYZE TABLE <table> does not collect statistical data on min/max values
for BLOB columns of <table>. However these values can be added into
mysql.column_stats manually by executing proper statements.
Unfortunately this led to a memory leak because the memory allocated
for these values was never freed.
This patch provides the server with a function to free memory allocated
for min/max statistical values of BLOB types.
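A simplified sketch of such a cleanup function; the member names
(table_stats, column_stats, min_value, max_value) follow this description
and the real code may differ:

    /* free min/max values read for the columns of a table share */
    void delete_stat_values_for_table_share(TABLE_SHARE *table_share)
    {
      Table_statistics *stats= table_share->stats_cb.table_stats;
      if (!stats || !stats->column_stats)
        return;
      Column_statistics *col= stats->column_stats;
      for (Field **f= table_share->field; *f; f++, col++)
      {
        delete col->min_value;  col->min_value= NULL;
        delete col->max_value;  col->max_value= NULL;
      }
    }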
Temporarily changed the test case until MDEV-16711 is fixed, as without
that fix the test case for MDEV-16757 did not fail only on 10.0.
ANALYZE TABLE <table> does not collect statistical data on min/max values
for BLOB columns of <table>. However these values can be added into
mysql.column_stats manually by executing proper statements.
Unfortunately this led to a memory leak because the memory allocated
for these values was never freed.
This patch provides the server with a function to free memory allocated
for min/max statistical values of BLOB types.
Backport the fix f214d36512 to 10.0
Author: Sergei Golubchik <serg@mariadb.org>
Date: Tue Apr 17 00:44:34 2018 +0200
ASAN error in is_stat_table()
don't memcmp beyond the first argument's end
Also: use my_strcasecmp(table_alias_charset), like elsewhere, not memcmp
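A hedged sketch of the safer check; stat_tables_db_name and
stat_table_name are the server's constants for the mysql statistics
tables, and the loop bound name is illustrative:

    /* my_strcasecmp() compares full null-terminated names, so it cannot
       read past the end of a shorter argument the way a length-less
       memcmp could */
    bool is_stat_table(const char *db, const char *table)
    {
      if (my_strcasecmp(table_alias_charset, db, stat_tables_db_name.str))
        return false;                         /* not the mysql schema */
      for (uint i= 0; i < STATISTICS_TABLES; i++)
        if (!my_strcasecmp(table_alias_charset, table,
                           stat_table_name[i].str))
          return true;
      return false;
    }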