Other usages of persistent statistics check 'stats_is_read' in the
caller, which is why this was not noticed earlier.
Other things:
- Simplified no_stat_values_provided
Fixed missing initialization of Alter_info()
This could cause crashes in some create table like scenarios
where some generated indexes were automatically dropped.
I also added a test that we do not try to drop from index_stats for
temporary tables.
The intention was always to not create histograms for single value
unique keys (as histograms are not useful in this case), but because of
a bug in the code this was still done.
The changes in the test cases were mainly because hist_size is now NULL
for these kinds of columns.
The problem was that InnoDB sometimes returned a slightly wrong record count
for the table, which caused the optimizer to disregard the result from
the range optimizer. The end result was that the optimizer chose a
ref access instead of a range access, which caused errors in buildbot.
Fixed by adding more rows to the table to ensure that table scan is
more costly than range scan of the given interval.
The problem was that we did not handle errors properly in
JOIN::get_best_combination. In case of an early error, JOIN->join_tab would
contain uninitialized values, which would cause errors on cleanup().
The error in question was reported earlier, but not noticed until later.
One cause of this is that most of the sql_select.cc code just checks
thd->fatal_error and not thd->is_error().
Fixed by changing the checks of fatal_error to is_error().
This allows a user to change the default value of MAX_SEL_ARGS (16000)
in the rare case where they need more generated SEL_ARGS (as part of
the range optimizer).
Raise notes if indexes cannot be used:
- in case of data type or collation mismatch (different error messages).
- in case a table field was replaced with something else
(e.g. Item_func_conv_charset) during a condition rewrite.
Added option to write warnings and notes to the slow query log for
slow queries.
New variables added/changed:
- note_verbosity, which is a set of the following options:
basic - All old notes
unusable_keys - Print warnings about keys that cannot be used
for select, delete or update.
explain - Print unusable_keys warnings for EXPLAIN queries.
The default is 'basic,explain'. This means that for old installations
the only notable new behavior is that one will get notes about
unusable keys when one does an EXPLAIN for a query. One can turn off
all notes by either setting note_verbosity to "" or setting sql_notes=0.
- log_slow_verbosity has a new option 'warnings'. If this is set
then warnings and notes generated are printed in the slow query log
(up to log_slow_max_warnings times per statement).
- log_slow_max_warnings - Max number of warnings written to
slow query log.
Other things:
- One can now use =ALL for any 'set' variable to set all options at once.
For example using "note_verbosity=ALL" in a config file or
"SET @@note_verbosity=ALL" in SQL.
- mysqldump will in the future use @@note_verbosity="" instead of
@@sql_notes=0 to disable notes.
- Added "enum class Data_type_compatibility" and changed the return type
of all Field::can_optimize*() methods from "bool" to this new data type.
Reviewer & Co-author: Alexander Barkov <bar@mariadb.com>
- The code that prints out the notes comes mainly from Alexander
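The =ALL shorthand for 'set' variables described above can be pictured with a
small Python sketch. It only models the described behavior and is not the
server's parser: the function name parse_set_value and the way unknown names
are rejected are assumptions made for illustration. The option names are the
ones listed for note_verbosity.

```python
# Hypothetical model of parsing the value of a 'set' system variable.
# The option names come from the note_verbosity description above.
NOTE_VERBOSITY_OPTIONS = ("basic", "unusable_keys", "explain")


def parse_set_value(value, options=NOTE_VERBOSITY_OPTIONS):
    """Return the enabled options as a frozenset.

    "ALL" (any case) enables every option, "" enables none, anything
    else is read as a comma-separated list of option names.
    """
    text = value.strip()
    if text.upper() == "ALL":
        return frozenset(options)
    if not text:
        return frozenset()
    chosen = [name.strip().lower() for name in text.split(",")]
    unknown = [name for name in chosen if name not in options]
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(unknown))
    return frozenset(chosen)


# The documented default is 'basic,explain'.
DEFAULT_NOTE_VERBOSITY = parse_set_value("basic,explain")
```

With this model, parse_set_value("ALL") enables all three options and
parse_set_value("") disables every note.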
The warning is given in case of table not found or if there is a lock
timeout. The warning is needed because in case of a lock timeout the
persistent table stats are going to be wrong.
Example of what causes the problem:
T1: ANALYZE TABLE starts to collect statistics
T2: ALTER TABLE starts by deleting statistics for all changed fields,
then creates a temp table and copies data to it.
T1: ANALYZE ends and writes to the statistics tables.
T2: ALTER TABLE renames temp table in place of the old table.
Now the statistics from ANALYZE match the old, deleted table.
Fixed by waiting to delete old statistics until ALTER TABLE is
the only one using the old table and ensure that rename of columns
can handle swapping of column names.
rename_columns_in_stat_table() (former rename_column_in_stat_tables())
now takes a list of columns to rename. It uses the following algorithm
to update column_stats to be able to handle circular renames
- While there are columns to be renamed and it is the first loop or
  the last rename loop did change something.
  - Loop over all columns to be renamed
    - Change column name in column_stat
    - If fail because of duplicate key
      - If this is first change attempt for this column
        - Change column name to a temporary column name
        - If there was a conflicting row, replace it with the current row.
    else
      - Remove entry from column list
- Loop over all remaining columns in the list
  - Remove the conflicting row
  - Change column from temporary name to final name in column_stat
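The rename algorithm above can be tried out on a toy model. The following
Python sketch is an illustration only, not the code in
rename_columns_in_stat_table(): column_stats is modelled as a dict keyed by
column name, a duplicate key is modelled as "the target name already exists",
and the "#tmp_N" temporary names are an assumption.

```python
def rename_columns_in_stats(stats, renames):
    """Rename keys of `stats` in place; `renames` is a list of (old, new).

    Handles swaps and longer rename cycles by parking a row under a
    temporary name when its final name is still occupied.
    """
    # Each pending entry is [current_name, final_name, moved_to_temp].
    pending = [[old, new, False] for old, new in renames]
    temp_counter = 0
    first_loop = True
    changed = False
    while pending and (first_loop or changed):
        first_loop = False
        changed = False
        for entry in list(pending):
            current, final, moved = entry
            if current not in stats:
                # No statistics stored for this column: nothing to rename.
                pending.remove(entry)
                changed = True
            elif final not in stats:
                # Plain rename succeeded, the column is done.
                stats[final] = stats.pop(current)
                pending.remove(entry)
                changed = True
            elif not moved:
                # "Duplicate key" on the first attempt: park the row under
                # a temporary name, replacing any stale row found there.
                temp = "#tmp_%d" % temp_counter
                temp_counter += 1
                stats[temp] = stats.pop(current)
                entry[0], entry[2] = temp, True
                changed = True
    # Whatever is left conflicts with a row that is not being renamed:
    # drop the conflicting row and move the parked row to its final name.
    for current, final, _ in pending:
        stats.pop(final, None)
        stats[final] = stats.pop(current)
    return stats
```

For a swap such as [("a", "b"), ("b", "a")] the first pass parks "a" under a
temporary name and renames "b", and the second pass moves the parked row to
"b".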
Other things:
- Don't flush tables for every operation. Only flush when all updates
are done.
- Rename of columns was not handled in case of ALGORITHM=copy (old bug).
- Fixed that we do not collect statistics for hidden hash columns
used by UNIQUE constraint on long values.
- Fixed that we do not collect statistics for blob columns referred by
generated virtual columns. This was achieved by storing the fields for
which we want to have statistics in table->has_value_set instead of
in table->read_set.
- Rename of indexes was not handled for persistent statistics.
- This is now handled similarly to rename of columns. Renamed indexes
are now stored in 'rename_stat_indexes' and handled in
Alter_info::delete_statistics() together with dropped indexes.
- ALTER TABLE .. ADD INDEX may, instead of creating a new index, rename
an existing generated foreign key index. This was not reflected in
the index_stats table because this was handled in
mysql_prepare_create_table instead of in the mysql_alter() code.
Fixed by adding a call in mysql_prepare_create_table() to drop the
changed index.
I also had to replace the code that marked the index to be ignored
with code that does not destroy the original index name.
Reviewer: Sergei Petrunia <sergey@mariadb.com>
When resolving a column from the HAVING clause, a new Item_field
object may be created inside Item_ref::fix_fields().
But the object is created with an empty name resolution context,
which then leads to debug assertion failure during
Item_field::fix_fields().
The solution is to pass the correct name resolution context
when creating the Item_field object.
Reviewer: Oleksandr Byelkin (sanja@mariadb.com)
On creation of a VIEW that depends on a stored routine an instance of
the class Item_func_sp is allocated on a memory root of SP statement.
It happens since mysql_make_view() calls the method
THD::activate_stmt_arena_if_needed()
before parsing definition of the view.
On the other hand, when sp_head's rcontext is created an instance of
the class Field referenced by the data member
Item_func_sp::result_field
is allocated on the Item_func_sp's Query_arena (call arena) that is set up
inside the method
Item_sp::execute_impl
just before calling the method
sp_head::execute_function()
On return from the method sp_head::execute_function() all items allocated
on the Item_func_sp's Query_arena are released and its memory root is freed
(see implementation of the method Item_sp::execute_impl). As a consequence,
the pointer
Item_func_sp::result_field
references deallocated memory. Later, when the method
sp_head::execute
cleans up items allocated for the just executed SP instruction, the method
Item_func_sp::cleanup is invoked and tries to delete the object referenced
by the data member Item_func_sp::result_field, which points to already
deallocated memory. This results in abnormal server termination.
To fix the issue, the current active arena shouldn't be switched to
a statement arena inside the function mysql_make_view() that is invoked indirectly
by the method sp_head::rcontext_create. It is implemented by introducing
the new Query_arena state STMT_SP_QUERY_ARGUMENTS, which is set when an explicit
Query_arena is created for placing SP arguments and other caller-side items
used during SP execution. The method THD::activate_stmt_arena_if_needed()
then checks the Query_arena's state and returns immediately without switching to
the statement's arena.
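The early return described above can be modelled in a few lines of Python.
This is a deliberately simplified sketch and not the server code: only the
state name STMT_SP_QUERY_ARGUMENTS comes from the text, while the OTHER state,
the class layout and the return convention are assumptions.

```python
from enum import Enum, auto


class ArenaState(Enum):
    OTHER = auto()                    # placeholder for every other state
    STMT_SP_QUERY_ARGUMENTS = auto()  # arena holding SP arguments


class QueryArena:
    def __init__(self, name, state=ArenaState.OTHER):
        self.name = name
        self.state = state


class Thd:
    """Toy session object tracking the active and the statement arena."""

    def __init__(self, stmt_arena, active_arena):
        self.stmt_arena = stmt_arena
        self.active_arena = active_arena

    def activate_stmt_arena_if_needed(self):
        """Switch to the statement arena unless the active arena holds
        SP arguments. Returns the previous arena, or None if no switch
        was made."""
        if self.active_arena.state is ArenaState.STMT_SP_QUERY_ARGUMENTS:
            return None
        if self.active_arena is self.stmt_arena:
            return None
        backup = self.active_arena
        self.active_arena = self.stmt_arena
        return backup
```

In this model, allocations made while the call arena is active stay in the
call arena, so nothing owned by the statement arena ends up pointing into
memory that is freed when the call returns.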
Fixed tests:
main.order_by_pack_big - disabled view-protocol for some queries
because the view is created with wrong column name if column
name > 64 symbols
Remove TLSv1.1 from the default tls_version system variable.
Output a warning if TLSv1.0 or TLSv1.1 are selected.
Thanks Tingyao Nian for the feature request.
* version_compile_os can be "linux-systemd", not equal to "Linux"
* main.no-threads forces no-threads scheduler, a check whether it
has one_thread_per_connection is guaranteed to fail.
recalculate long unique hash in Write_rows_log_event
and Update_rows_log_event.
normally generated columns (stored and indexed virtual)
are deterministic and their values don't need to be recalculated
on the slave as they're already present in the row image.
but the long unique hash function was changed in MDEV-27653,
so a row event from the old master will have the old hash,
but a table created on the new slave will need a new hash.
32 bit MariaDB crashed in innodb.innodb-16k and a few other tests.
Fixed by using correct sizeof() calls.
Histograms were not read if the first read was without histograms.
The problem is that s390x is not using the default bzip library we use
on other platforms, which causes compressed string lengths to be different
from what the mtr tests expect.
Fixed by:
- Added have_normal_bzip.inc, which checks if compress() returns the
expected length.
- Adjust the results to match the expected one
- main.func_compress.test & archive.archive
- Don't print lengths that depend on the compression library
- mysqlbinlog compress tests & connect.zip
- Don't print DATA_LENGTH for SET column_compression_zlib_level=1
- main.column_compression
Summary:
This patch enables possible index optimization when
the WHERE clause has an IN condition of the form:
signed_or_unsigned_column IN (signed_or_unsigned_constant,
signed_or_unsigned_constant
[,signed_or_unsigned_constant]*)
when the IN list constants are of different signedness, e.g.:
WHERE signed_column IN (signed_constant, unsigned_constant ...)
WHERE unsigned_column IN (signed_constant, unsigned_constant ...)
Details:
In a condition like:
WHERE unsigned_predicant IN (1, LONGLONG_MAX + 1)
comparison handlers for individual (predicant,value) pairs are
calculated as follows:
* unsigned_predicant and 1 produce &type_handler_newdecimal
* unsigned_predicant and (LONGLONG_MAX + 1) produce &type_handler_slonglong
The old code decided that it could not use bisection because
the two pairs had different comparison handlers.
As a result, bisection was not allowed, and, in case of
an indexed integer column predicant the index on the column was not used.
The new code catches special cases like:
signed_predicant IN (signed_constant, unsigned_constant)
unsigned_predicant IN (signed_constant, unsigned_constant)
It enables bisection using in_longlong, which supports a mixture
of predicant and values of different signedness.
In case when the predicant is an indexed column this change
automatically enables index range optimization.
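To make the mixed-signedness bisection more concrete, here is a minimal Python
sketch. It is a model of the idea only and not the in_longlong implementation:
values are stored as 64-bit patterns plus an "unsigned" flag, and the helper
names to_bits, cmp_mixed, build_in_list and in_list_contains are hypothetical.

```python
from functools import cmp_to_key

LONGLONG_MAX = 2**63 - 1
MASK64 = 2**64 - 1


def to_bits(value, unsigned):
    """Pack an integer into (64-bit pattern, unsigned flag)."""
    low, high = (0, MASK64) if unsigned else (-2**63, LONGLONG_MAX)
    if not low <= value <= high:
        raise OverflowError("value does not fit in 64 bits")
    return (value & MASK64, unsigned)


def as_signed(bits):
    """Interpret a 64-bit pattern as a signed integer."""
    return bits - 2**64 if bits > LONGLONG_MAX else bits


def cmp_mixed(a, b):
    """Three-way compare two packed values of possibly different signedness."""
    (a_bits, a_uns), (b_bits, b_uns) = a, b
    if a_uns != b_uns:
        # An unsigned value above LONGLONG_MAX beats every signed value.
        if a_uns and a_bits > LONGLONG_MAX:
            return 1
        if b_uns and b_bits > LONGLONG_MAX:
            return -1
    if a_uns and b_uns:
        left, right = a_bits, b_bits
    else:
        # At this point both values fit the signed range.
        left, right = as_signed(a_bits), as_signed(b_bits)
    return (left > right) - (left < right)


def build_in_list(constants):
    """Sort the IN list once so that lookups can use bisection."""
    packed = [to_bits(value, unsigned) for value, unsigned in constants]
    return sorted(packed, key=cmp_to_key(cmp_mixed))


def in_list_contains(sorted_items, probe):
    """Binary search for `probe` in a list built by build_in_list()."""
    lo, hi = 0, len(sorted_items)
    while lo < hi:
        mid = (lo + hi) // 2
        order = cmp_mixed(sorted_items[mid], probe)
        if order == 0:
            return True
        if order < 0:
            lo = mid + 1
        else:
            hi = mid
    return False
```

The point of the comparator is that a signed -5 and an unsigned value with the
same bit pattern are kept apart, while a signed 1 and an unsigned 1 compare as
equal, so one sorted array can serve both kinds of constants.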
Thanks to Vicențiu Ciorbaru for proposing the idea and for preparing MTR tests.
This commit enables reloading of engine-independent statistics
without flushing the table from the table definition cache.
This is achieved by allowing multiple versions of the
TABLE_STATISTICS_CB object and having independent pointers to it in
TABLE and TABLE_SHARE. The TABLE_STATISTICS_CB objects are reference
counted and are freed when no one is pointing to them anymore.
TABLE's TABLE_STATISTICS_CB pointer is updated to use the
TABLE_SHARE's pointer when read_statistics_for_tables() is called at
the beginning of a query.
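The versioning scheme can be sketched as follows. This Python model is only an
illustration under simplifying assumptions: the class and attribute names
(StatsCB, usage_count, publish, read_statistics) are hypothetical, a "freed"
flag stands in for actually releasing memory, and no locking is shown.

```python
class StatsCB:
    """Stand-in for one version of a TABLE_STATISTICS_CB object."""

    def __init__(self, version):
        self.version = version
        self.usage_count = 0   # number of pointers to this version
        self.freed = False


def release(cb):
    """Drop one reference and 'free' the object when none are left."""
    if cb is None:
        return
    cb.usage_count -= 1
    if cb.usage_count == 0:
        cb.freed = True


class TableShare:
    def __init__(self):
        self.stats_cb = None

    def publish(self, new_cb):
        """Install freshly loaded statistics as the current version."""
        old = self.stats_cb
        new_cb.usage_count += 1
        self.stats_cb = new_cb
        release(old)


class Table:
    def __init__(self, share):
        self.share = share
        self.stats_cb = None

    def read_statistics(self):
        """At query start, switch to the share's current version."""
        current = self.share.stats_cb
        if current is not self.stats_cb:
            if current is not None:
                current.usage_count += 1
            release(self.stats_cb)
            self.stats_cb = current
```

A table that is in the middle of a query keeps its old version alive, and the
old version is released only when that table picks up the new one at the start
of its next query.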
Main changes:
- read_statistics_for_table() will allocate a new TABLE_STATISTICS_CB
object.
- All get_stat_values() functions have a new parameter that tells
where collected data should be stored. get_stat_values() no longer
uses the table_field object to store data.
- All get_stat_values() functions return 1 if they found any
data in the statistics tables.
Other things:
- Fixed INSERT DELAYED to not read statistics tables.
- Removed Statistics_state from TABLE_STATISTICS_CB as it is not
needed anymore, because we are not changing TABLE_SHARE->stats_cb while
calculating or loading statistics.
- Store values used with store_from_statistical_minmax_field() in
TABLE_STATISTICS_CB::mem_root. This allowed me to remove the function
delete_stat_values_for_table_share().
- Field_blob::store_from_statistical_minmax_field() is implemented
but is not normally used as we do not yet support EITS
for blobs. For example Field_blob::update_min() and
Field_blob::update_max() are not implemented.
Note that the function can be called if there is a concurrent
"ALTER TABLE MODIFY field BLOB" running because of a bug in
ALTER TABLE where it deletes entries from column_stats
before it has an exclusive lock on the table.
- Use result of field->val_str(&val) as a pointer to the result
instead of val (safety fix).
- Allocate memory for collected statistics in THD::mem_root, not in
TABLE::mem_root. The old allocation could cause the TABLE object to grow
if ANALYZE TABLE was run many times on the same table.
This was done in allocate_statistics_for_table(),
create_min_max_statistical_fields_for_table() and
create_min_max_statistical_fields_for_table_share().
- Store in TABLE_STATISTICS_CB::stats_available which statistics were
found in the statistics tables.
- Removed index_table from class Index_prefix_calc as it was not used.
- Added TABLE_SHARE::LOCK_statistics to ensure we don't load EITS
in parallel. First thread will load it, others will reuse the
loaded data.
- Eliminate read_histograms_for_table(). The loading happens within
read_statistics_for_tables() if histograms are needed.
One downside is that if we have read statistics without histograms
before and someone requires histograms, we have to read all statistics
again (once) from the statistics tables.
A smaller downside is the need to call alloc_root() for each
individual histogram. Before we could allocate all the space for
histograms with a single alloc_root.
- Fixed bug in MyISAM and Aria where they did not properly notice
that table had changed after analyze table. This was not a problem
before this patch as then the MyISAM and Aria tables were flushed
as part of ANALYZE table which did hide this issue.
- Fixed a bug in ANALYZE table where table->records could be seen as 0
in collect_statistics_for_table(). The effect of this unlikely bug
was that a full table scan could be done even if
analyze_sample_percentage was not set to 1.
- Changed multiple mallocs in a row to use multi_alloc_root().
- Added mutex protection in update_statistics_for_table() to ensure
that several threads are not updating the statistics at the same time.
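The "first thread loads, others reuse" behavior that LOCK_statistics is meant
to give can be sketched in Python. This is a generic load-once pattern under a
mutex, not the server code; ShareStats, get and read_from_many_threads are
hypothetical names, and the loader stands in for reading the statistics tables.

```python
import threading


class ShareStats:
    """Toy share object: statistics are loaded at most once."""

    def __init__(self, loader):
        self._lock = threading.Lock()   # plays the role of LOCK_statistics
        self._loader = loader
        self._stats = None
        self.load_count = 0

    def get(self):
        # Only one thread at a time can be inside the load; later threads
        # find the data already present and reuse it.
        with self._lock:
            if self._stats is None:
                self._stats = self._loader()
                self.load_count += 1
            return self._stats


def read_from_many_threads(share, thread_count):
    """Call share.get() from several threads and collect the results."""
    results = []

    def worker():
        results.append(share.get())

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

However many threads ask for the statistics, the loader runs once and every
thread ends up with the same object.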
Some of the changes in sql_statistics.cc are based on a patch from
Oleg Smirnov <olernov@gmail.com>
Co-authored-by: Oleg Smirnov <olernov@gmail.com>
Co-authored-by: Vicentiu Ciorbaru <cvicentiu@gmail.com>
Reviewer: Sergei Petrunia <sergey@mariadb.com>
The problem is that the first execution of the prepared statement makes
a permanent optimization of converting the LEFT JOIN to an INNER JOIN.
This is based on the assumption that all the user parameters (?) are
always constants and that parameters to Item_cond() will not change value
between true and false across different executions.
(The example was using IS NULL, whose value changes
depending on whether the parameter is NULL or not.)
The fix is to change Item_cond::fix_fields() and
Item_cond::eval_not_null_tables() to not treat user parameters as
constants. This will ensure that we don't do the LEFT JOIN -> INNER
JOIN conversion that causes problems.
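Why folding a parameter into a constant is unsafe here can be shown with a toy
model. The Python sketch below is an illustration only and does not mirror
Item_cond: conditions are tuples, only OR is modelled, and the names
null_rejected_tables and make_condition are hypothetical. A table appears in
the result when the condition can never be true for a NULL-complemented row of
that table, which is what would allow turning a LEFT JOIN into an INNER JOIN.

```python
def analyze(cond, fold_params):
    """Return (constant_value_or_None, tables_that_reject_null_rows)."""
    kind = cond[0]
    if kind == "eq":
        # ("eq", table): something like table.col = 1, false for NULL rows.
        return None, frozenset([cond[1]])
    if kind == "param_is_null":
        # ("param_is_null", value): the SQL expression "? IS NULL".
        if fold_params:
            return (cond[1] is None), frozenset()
        return None, frozenset()
    if kind == "or":
        left = analyze(cond[1], fold_params)
        right = analyze(cond[2], fold_params)
        if left[0] is True or right[0] is True:
            return True, frozenset()      # always true: rejects nothing
        if left[0] is False:
            return right                  # a false branch can be ignored
        if right[0] is False:
            return left
        return None, left[1] & right[1]   # both branches must reject
    raise ValueError("unsupported condition: %r" % (kind,))


def null_rejected_tables(cond, fold_params):
    return analyze(cond, fold_params)[1]


def make_condition(param_value):
    """WHERE ? IS NULL OR t2.b = 1, with the given parameter value."""
    return ("or", ("param_is_null", param_value), ("eq", "t2"))
```

If the parameter is folded on the first execution with a non-NULL value, t2
looks NULL-rejecting and the join could be converted permanently. A later
execution with a NULL parameter would then return wrong results. Without
folding, the result is the same for every parameter value.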
There are also some things that need to be improved regarding
calculations of not_null_tables_cache as we get a different value for
WHERE 1 or t1.a=1
compared to
WHERE t1.a=1 or 1
Changes done:
- Mark Item_param with the PARAM flag to be able to quickly check
in Item_cond::eval_not_null_tables() if an item contains a
prepared statement parameter (just like we check for stored procedure
parameters).
- Fixed Item_cond::not_null_tables_cache so that it does not depend on
the order of arguments.
- Don't call item->eval_const_cond() for items that are NOT on the top
level of the WHERE clause. This removed a lot of unnecessary
warnings in the test suite!
- Do not reset not_null_tables_cache for not top level items.
- Simplified Item_cond::fix_fields by calling eval_not_null_tables()
instead of having duplication of all the code in
eval_not_null_tables().
- Return an error if Item_cond::fix_fields() generates an error.
The old code did generate an error in some cases, but not in all
cases.
- Fixed all handling of the above error in make_cond_for_tables().
The error handling by the callers did not exist before, which
could lead to asserts in many different places in the old code.
- All changes in sql_select.cc are just checking the return value of
fix_fields() and make_cond_for_tables() and returning an error
value if fix_fields() returns true or make_cond_for_tables()
returns NULL and is_error() is set.
- Mark Item_cond as const_item if all arguments return true for
can_eval_in_optimize().
Reviewer: Sergei Petrunia <sergey@mariadb.com>
Field_varstring::get_copy_func() did not take into account
that functions do_varstring1[_mb], do_varstring2[_mb] do not support
compressed data.
Changing the return value of Field_varstring::get_copy_func()
to `do_field_string` if there is compression and truncation
at the same time. This fixes the problem, so now it works as follows:
- val_str() uncompresses the data
- The prefix is then calculated on the uncompressed data
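The reason the prefix has to be taken from the uncompressed data can be
demonstrated with Python's zlib module. This is a sketch of the principle
only: plain zlib streams stand in for the column compression format, and the
function names copy_prefix_compressed and naive_truncate are hypothetical.

```python
import zlib


def copy_prefix_compressed(stored, max_chars):
    """Uncompress first, then take a character prefix of the result."""
    text = zlib.decompress(stored).decode("utf-8")
    return text[:max_chars]


def naive_truncate(stored, max_bytes):
    """What a plain byte copy with truncation would do to compressed data."""
    return stored[:max_bytes]


def is_valid_stream(data):
    """Report whether `data` is a complete zlib stream."""
    try:
        zlib.decompress(data)
        return True
    except zlib.error:
        return False
```

Cutting the compressed bytes leaves an incomplete stream that can no longer be
read back, while uncompressing first gives a well-defined prefix, also for
multi-byte characters.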
Additionally, introducing two new copying functions
- do_varstring1_no_truncation()
- do_varstring2_no_truncation()
Using new copying functions in cases when:
- a Field_varstring with length_bytes==1 is changing to a longer
Field_varstring with length_bytes==1
- a Field_varstring with length_bytes==2 is changing to a longer
Field_varstring with length_bytes==2
In these cases we care about neither compression nor
multi-byte prefixes: the entire data gets fully copied
from the source column to the target column as is.
This is a kind of new optimization, but this also was needed
to preserve existing MTR test results.