Enable unusable key notes for non-equality predicates:
<, <=, >=, >, BETWEEN, IN, LIKE
Note, in some scenarios it displays duplicate notes, e.g.
for queries with ORDER BY:
SELECT * FROM t1
WHERE indexed_string_column >= 10
ORDER BY indexed_string_column
LIMIT 5;
This should be tolerable. Getting rid of the duplicate note
completely would need a much more complex patch, which is
not desirable in 10.6.
Details:
- Changing RANGE_OPT_PARAM::note_unusable_keys from bool
to a new data type Item_func::Bitmap, so the caller can
choose with a better granularity which predicates
should raise unusable key notes inside the range optimizer:
a. all predicates (=, <=>, <, <=, >=, >, BETWEEN, IN, LIKE)
b. all predicates except equality (=, <=>)
c. none of the predicates
"b." is needed because in some scenarios equality predicates (=, <=>)
send unusable key notes at an earlier stage, before the range optimizer,
during update_ref_and_keys(). Calling the range optimizer with
"all predicates" would produce duplicate notes for = and <=> in such cases.
- Fixing get_quick_record_count() to call the range optimizer
with "all predicates except equality" instead of "none of the predicates".
Before this change the range optimizer suppressed all notes for
non-equality predicates: <, <=, >=, >, BETWEEN, IN, LIKE.
This actually fixes the reported problem.
- Fixing JOIN::make_range_rowid_filters() to call the range optimizer
with "all predicates except equality" instead of "all predicates".
Before this change the range optimizer produced duplicate notes
for = and <=> during a rowid_filter optimization.
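The three choices (a, b, c) described above can be pictured as a small bitmap with one bit per predicate kind. The following is only a sketch: `Pred`, `PredBitmap`, the `kNotes...` constants and `should_raise_note()` are hypothetical names made up for illustration and are not the actual Item_func::Bitmap interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical predicate kinds; the names are illustrative only.
enum class Pred : std::uint32_t {
  Eq = 1u << 0,          // =
  NullSafeEq = 1u << 1,  // <=>
  Lt = 1u << 2,          // <
  Le = 1u << 3,          // <=
  Ge = 1u << 4,          // >=
  Gt = 1u << 5,          // >
  Between = 1u << 6,
  In = 1u << 7,
  Like = 1u << 8
};

constexpr std::uint32_t bit(Pred p) { return static_cast<std::uint32_t>(p); }

// A small value type holding one bit per predicate kind.
class PredBitmap {
 public:
  constexpr explicit PredBitmap(std::uint32_t bits) : bits_(bits) {}
  constexpr bool is_set(Pred p) const { return (bits_ & bit(p)) != 0; }

 private:
  std::uint32_t bits_;
};

constexpr std::uint32_t kAllBits = (1u << 9) - 1;

// a. all predicates
constexpr PredBitmap kNotesForAll(kAllBits);
// b. all predicates except equality (=, <=>)
constexpr PredBitmap kNotesExceptEquality(
    kAllBits & ~(bit(Pred::Eq) | bit(Pred::NullSafeEq)));
// c. none of the predicates
constexpr PredBitmap kNotesForNone(0);

// The range optimizer would consult the caller's bitmap before raising a note.
bool should_raise_note(PredBitmap enabled, Pred p) { return enabled.is_set(p); }
```

With this shape, a caller that already reported = and <=> at an earlier stage would pass `kNotesExceptEquality`, so only the non-equality predicates produce notes.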
- Cleanup:
Adding the op_collation argument to Field::raise_note_cannot_use_key_part()
and displaying the operation collation rather than the argument collation
in the unusable key note. This is important for operations with more than
two arguments: BETWEEN and IN, e.g.:
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
BETWEEN 'a' AND 'b' COLLATE utf8mb3_unicode_ci;
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
IN ('a', 'b' COLLATE utf8mb3_unicode_ci);
The note for 'a' now prints utf8mb3_unicode_ci as the collation,
which is the collation of the entire operation:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_unicode_ci`
Before this change it printed the collation of 'a',
so the note was confusing:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_general_ci`
(Variant#3: Allow cross-charset comparisons, use a special
CHARSET_INFO to create lookup keys. Review input addressed.)
Equalities that compare utf8mb{3,4}_general_ci strings, like:
WHERE ... utf8mb3_key_col=utf8mb4_value (MB3-4-CMP)
can now be used to construct ref[const] access and also participate
in multiple-equalities.
This means that utf8mb3_key_col can be used for key-lookups when
compared with a utf8mb4 constant, field or expression using '=' or
'<=>' comparison operators.
This is controlled by optimizer_switch='cset_narrowing=on', which is
OFF by default.
IMPLEMENTATION
Item value comparison in (MB3-4-CMP) is done using utf8mb4_general_ci.
This is valid as any utf8mb3 value is also an utf8mb4 value.
When making index lookup value for utf8mb3_key_col, we do "Charset
Narrowing": characters that are in the Basic Multilingual Plane (=BMP) are
copied as-is, as they can be represented in utf8mb3. Characters that are
outside the BMP cannot be represented in utf8mb3 and are replaced
with U+FFFD, the "Replacement Character".
In utf8mb4_general_ci, the Replacement Character compares as equal to any
character that's not in BMP. Because of this, the constructed lookup value
will find all index records that would be considered equal by the original
condition (MB3-4-CMP).
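The narrowing step can be illustrated on UTF-8 encoded bytes: sequences of up to three bytes (BMP characters) are copied, and every four-byte sequence (a character outside the BMP) is replaced with U+FFFD. This is a minimal sketch, not the server code: the function names are hypothetical, the input is assumed to be well-formed UTF-8, and the special CHARSET_INFO mentioned above is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length of a UTF-8 sequence judged by its lead byte.
// Unexpected lead bytes are treated as single bytes (input is assumed valid).
static std::size_t utf8_sequence_length(unsigned char lead) {
  if (lead < 0x80) return 1;
  if (lead >= 0xC0 && lead <= 0xDF) return 2;
  if (lead >= 0xE0 && lead <= 0xEF) return 3;
  if (lead >= 0xF0 && lead <= 0xF7) return 4;
  return 1;
}

// Hypothetical helper: build a utf8mb3-representable lookup value
// from a utf8mb4 value.
std::string narrow_utf8mb4_to_utf8mb3(const std::string& in) {
  static const char kReplacement[] = "\xEF\xBF\xBD";  // U+FFFD in UTF-8
  std::string out;
  out.reserve(in.size());
  for (std::size_t i = 0; i < in.size();) {
    std::size_t len = utf8_sequence_length(static_cast<unsigned char>(in[i]));
    if (len > in.size() - i) len = in.size() - i;  // truncated tail: copy the rest
    if (len == 4)
      out += kReplacement;      // outside the BMP: not representable in utf8mb3
    else
      out.append(in, i, len);   // BMP character: copied as-is
    i += len;
  }
  return out;
}
```

For example, narrowing the bytes of "a", U+1F600 and "é" keeps "a" and "é" and turns the middle character into the three bytes EF BF BD.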
Approved-by: Monty <monty@mariadb.org>
The MDEV-29693 conflict resolution is from Monty, as is
a bug fix where ANALYZE TABLE wrongly built histograms for
a single-column PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.
Other things:
- Copied main.log_slow from 10.4 to avoid mtr issue
Disabled test:
- spider/bugfix.mdev_27239 because we started to get
+Error 1429 Unable to connect to foreign data source: localhost
-Error 1158 Got an error reading communication packets
- main.delayed
- Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
This part is disabled for now as it fails randomly with different
warnings/errors (no corruption).
The crash inside my_vsnprintf_utf32() happened correctly,
because the caller methods:
Field_string::sql_rpl_type()
Field_varstring::sql_rpl_type()
mis-used the charset library and sent pure ASCII data to the
virtual function snprintf() of a utf32 CHARSET_INFO.
It was wrong to use Field::charset() in sql_rpl_type().
We're printing the metadata (the data type) here, not the column data.
The string containing the data type of a CHAR/VARCHAR column
is a pure ASCII string.
Fixing to use res->charset() to print, like all virtual implementations
of sql_type() do.
Review was done by Andrei Elkin.
Thanks to Andrei for proposing MTR test improvements.
There were several reasons why the test failed:
- Constructors for Field_double and Field_float changed an argument
to the constructor instead of the correct class variable.
- gcc 7.5.0 produced wrong code when inlining Field_double constructor
into Field_test_double constructor.
Fixed by changing the correct class variable and making the constructors
non-inline to work around the gcc bug.
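The first problem is a common C++ pitfall: adjusting a constructor parameter after the member has already been initialized from it has no effect on the object. The sketch below reproduces the pattern with made-up types; `BuggyField`, `FixedField` and both constants are hypothetical and do not reflect the real Field_double code or its limits.

```cpp
#include <cassert>

// Hypothetical limits, chosen only for the illustration.
constexpr unsigned kMaxFixedDecimals = 31;
constexpr unsigned kNotFixedDec = 39;

struct BuggyField {
  unsigned dec;
  explicit BuggyField(unsigned dec_arg) : dec(dec_arg) {
    // Bug: this changes the local parameter, the member keeps the old value.
    if (dec_arg >= kMaxFixedDecimals) dec_arg = kNotFixedDec;
    (void) dec_arg;
  }
};

struct FixedField {
  unsigned dec;
  explicit FixedField(unsigned dec_arg);  // defined out of line, not inline
};

FixedField::FixedField(unsigned dec_arg) : dec(dec_arg) {
  // Fix: normalize the class variable itself.
  if (dec >= kMaxFixedDecimals) dec = kNotFixedDec;
}
```

Here `BuggyField(40).dec` stays 40, while `FixedField(40).dec` becomes `kNotFixedDec`.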
Raise notes if indexes cannot be used:
- in case of data type or collation mismatch (different error messages).
- if a table field was replaced with something else
(e.g. Item_func_conv_charset) during a condition rewrite.
Added option to write warnings and notes to the slow query log for
slow queries.
New variables added/changed:
- note_verbosity, which is a set of the following options:
basic - All old notes
unusable_keys - Print warnings about keys that cannot be used
for select, delete or update.
explain - Print unusable_keys warnings for EXPLAIN queries.
The default is 'basic,explain'. This means that for old installations
the only notable new behavior is that one will get notes about
unusable keys when one does an EXPLAIN for a query. One can turn off
all notes by either setting note_verbosity to "" or setting sql_notes=0.
- log_slow_verbosity has a new option 'warnings'. If this is set
then warnings and notes generated are printed in the slow query log
(up to log_slow_max_warnings times per statement).
- log_slow_max_warnings - Max number of warnings written to
slow query log.
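How the note_verbosity options might combine when deciding whether to raise an unusable key note can be sketched as follows. This is an illustration only: the flag names and `give_unusable_key_notes()` are hypothetical, and it assumes that 'unusable_keys' enables the notes for every statement while 'explain' enables them for EXPLAIN only.

```cpp
#include <cassert>

// Hypothetical bit flags for the three note_verbosity options.
enum NoteVerbosity : unsigned {
  NOTE_BASIC = 1u << 0,
  NOTE_UNUSABLE_KEYS = 1u << 1,
  NOTE_EXPLAIN = 1u << 2
};

// The documented default: 'basic,explain'.
constexpr unsigned kDefaultNoteVerbosity = NOTE_BASIC | NOTE_EXPLAIN;

// Assumed rule: notes are raised when 'unusable_keys' is set, or when the
// statement is an EXPLAIN and 'explain' is set.
bool give_unusable_key_notes(unsigned verbosity, bool is_explain) {
  return (verbosity & NOTE_UNUSABLE_KEYS) != 0 ||
         (is_explain && (verbosity & NOTE_EXPLAIN) != 0);
}
```

Under the default value, an EXPLAIN gets the notes and a plain SELECT does not, which matches the behavior described for old installations.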
Other things:
- One can now use =ALL for any 'set' variable to set all options at once.
For example using "note_verbosity=ALL" in a config file or
"SET @@note_verbosity=ALL" in SQL.
- mysqldump will in the future use @@note_verbosity="" instead of
@@sql_notes=0 to disable notes.
- Added "enum class Data_type_compatibility" and changing the return type
of all Field::can_optimize*() methods from "bool" to this new data type.
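The =ALL shorthand for 'set' variables mentioned in this list can be sketched as a small parser that maps a comma-separated value onto bits. This is not the server's parser: `parse_set_value()` is a hypothetical helper, and case-insensitive matching is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

static std::string to_lower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Parses "a,b,c" or "ALL" into a bitmask over `names` (bit i = names[i]).
// Returns false and leaves `out` untouched when an unknown name is found.
bool parse_set_value(const std::string& value,
                     const std::vector<std::string>& names,
                     unsigned long& out) {
  assert(names.size() < sizeof(unsigned long) * 8);
  const std::string v = to_lower(value);
  if (v == "all") {
    out = (1UL << names.size()) - 1;  // every option at once
    return true;
  }
  unsigned long bits = 0;
  std::istringstream in(v);
  std::string item;
  while (std::getline(in, item, ',')) {
    if (item.empty()) continue;
    std::vector<std::string>::const_iterator it =
        std::find(names.begin(), names.end(), item);
    if (it == names.end()) return false;
    bits |= 1UL << static_cast<unsigned long>(it - names.begin());
  }
  out = bits;  // an empty value yields the empty set
  return true;
}
```

With the option names basic, unusable_keys and explain, "ALL" yields all three bits and an empty string yields none.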
Reviewer & Co-author: Alexander Barkov <bar@mariadb.com>
- The code that prints out the notes comes mainly from Alexander
remove old deprecation helpers that were not used anywhere.
create new deprecation helpers and enforce their usage
this also removes inconsistencies in reporting deprecation:
sometimes it was ER_WARN_DEPRECATED_SYNTAX (1287),
sometimes ER_WARN_DEPRECATED_SYNTAX_NO_REPLACEMENT (1681),
sometimes a warning, sometimes a note.
it should always be
* ER_WARN_DEPRECATED_SYNTAX
* a warning (because it's something actionable, not purely informational)
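The idea of a single helper that always produces the same kind of diagnostic can be sketched like this. The types, the helper name and the message wording are all hypothetical; only the error number 1287 and the "always a warning" rule come from the text above.

```cpp
#include <cassert>
#include <string>

enum class Level { Note, Warning };

struct Diagnostic {
  Level level;
  unsigned code;
  std::string text;
};

// Error number of ER_WARN_DEPRECATED_SYNTAX as quoted in the message above.
constexpr unsigned kErWarnDeprecatedSyntax = 1287;

// Hypothetical helper: every deprecation goes through one place, so the
// level and the error code cannot diverge between call sites.
Diagnostic make_deprecation_warning(const std::string& what,
                                    const std::string& replacement) {
  std::string text = "'" + what + "' is deprecated";  // illustrative wording
  if (!replacement.empty()) text += "; use '" + replacement + "' instead";
  return Diagnostic{Level::Warning, kErWarnDeprecatedSyntax, text};
}
```

Whether or not a replacement exists, the caller gets a warning with the same error number.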
There are two functions to extract a Field::val_str() value
as a LEX_STRING or LEX_CSTRING pointing to the data allocated on a MEM_ROOT:
char *get_field(MEM_ROOT *mem, Field *field);
bool get_field(MEM_ROOT *mem, Field *field, class String *res);
The first function requires strlen() calls to make a LEX_CSTRING/LEX_STRING.
The second function requires a redundant String buffer,
which is used only as a temporary proxy value pointing to a MEM_ROOT fragment
(and does not use any String dynamic allocation methods).
This patch adds a native way to extract a Field::val_str() value
as a LEX_STRING or LEX_CSTRING pointing to a MEM_ROOT fragment.
It helps to remove redundant strlen() calls and redundant String buffers.
- Adding a new method:
LEX_STRING Field::val_lex_string_strmake(MEM_ROOT *mem);
- Reusing the new method Field::val_lex_string_strmake() in:
bool get_field(MEM_ROOT *mem, Field *field, String *res);
Also, moving it from table.cc to a static function in sql_help.cc.
It is used in sql_help.cc only, and we don't want it to be reused
in other parts of the code (to avoid redundant String buffers).
- Reusing the new method Field::val_lex_string_strmake() in this function:
char *get_field(MEM_ROOT *mem, Field *field);
- Replacing get_field() with Field::val_lex_string_strmake() in these files:
sql_plugin.cc (redundant String buffers were removed)
sql_udf.cc (redundant strlen() calls were removed)
Note, this function:
char *get_field(MEM_ROOT *mem, Field *field);
is still used in a number of files:
event_data_objects.cc
event_db_repository.cc
sql_acl.cc
sql_servers.cc
These remaining calls will be removed by separate patches,
and get_field() will be removed after that.
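The benefit of returning pointer and length together can be shown with a toy arena. Everything below is a simplified stand-in: `Arena` only imitates a MEM_ROOT, `LexString` imitates LEX_STRING, and `ToyField` is not the real Field class. The point is that the length is known at copy time, so no later strlen() call and no intermediate String buffer is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

struct LexString {
  char* str;
  std::size_t length;
};

// Toy arena: memory lives until the arena is destroyed.
class Arena {
 public:
  char* alloc(std::size_t n) {
    blocks_.emplace_back(new char[n]);
    return blocks_.back().get();
  }

 private:
  std::vector<std::unique_ptr<char[]>> blocks_;
};

// Copies `length` bytes into the arena and appends a terminating zero byte.
LexString strmake_lex(Arena& arena, const char* data, std::size_t length) {
  char* buf = arena.alloc(length + 1);
  if (length) std::memcpy(buf, data, length);
  buf[length] = '\0';
  return LexString{buf, length};
}

struct ToyField {
  std::string value;  // stands in for the result of val_str()
  LexString val_lex_string_strmake(Arena& arena) const {
    return strmake_lex(arena, value.data(), value.size());
  }
};
```

A caller receives both `str` and `length` in one step and the data stays valid for the lifetime of the arena.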
This commit enables reloading of engine-independent statistics
without flushing the table from table definition cache.
This is achieved by allowing multiple versions of the
TABLE_STATISTICS_CB object and having independent pointers to it in
TABLE and TABLE_SHARE. TABLE_STATISTICS_CB objects are reference
counted and are freed when nothing points to them anymore.
TABLE's TABLE_STATISTICS_CB pointer is updated to use the
TABLE_SHARE's pointer when read_statistics_for_tables() is called at
the beginning of a query.
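The versioning scheme can be pictured with std::shared_ptr standing in for the reference counting. This is a deliberately simplified model and not the server code: `StatsVersion`, `ToyShare`, `ToyTable` and `reload_statistics()` are hypothetical, and locking is left out.

```cpp
#include <cassert>
#include <memory>

// One immutable version of the loaded statistics.
struct StatsVersion {
  int version;
  explicit StatsVersion(int v) : version(v) {}
};

// Stand-in for TABLE_SHARE: always points to the newest version.
struct ToyShare {
  std::shared_ptr<const StatsVersion> stats_cb;
};

// Stand-in for TABLE: keeps its own pointer for the duration of a query.
struct ToyTable {
  ToyShare* share;
  std::shared_ptr<const StatsVersion> stats_cb;

  // Called at the beginning of a query: pick up the share's current version.
  void start_query() { stats_cb = share->stats_cb; }
};

// Reloading installs a new version without touching tables that are in use.
void reload_statistics(ToyShare& share, int new_version) {
  share.stats_cb = std::make_shared<const StatsVersion>(new_version);
}
```

A table that started its query before the reload keeps reading the old version, and the old version is released once the last such table moves on to the new one.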
Main changes:
- read_statistics_for_table() will allocate a new TABLE_STATISTICS_CB
object.
- All get_stat_values() functions have a new parameter that tells
where collected data should be stored. The get_stat_values() functions
no longer use the table_field object to store data.
- All get_stat_values() functions return 1 if they found any
data in the statistics tables.
Other things:
- Fixed INSERT DELAYED to not read statistics tables.
- Removed Statistics_state from TABLE_STATISTICS_CB as it is no longer
needed: we are not changing TABLE_SHARE->stats_cb while
calculating or loading statistics.
- Store values used with store_from_statistical_minmax_field() in
TABLE_STATISTICS_CB::mem_root. This allowed me to remove the function
delete_stat_values_for_table_share().
- Field_blob::store_from_statistical_minmax_field() is implemented
but is not normally used as we do not yet support EITS
for blobs. For example Field_blob::update_min() and
Field_blob::update_max() are not implemented.
Note that the function can be called if there is a concurrent
"ALTER TABLE MODIFY field BLOB" running because of a bug in
ALTER TABLE where it deletes entries from column_stats
before it has an exclusive lock on the table.
- Use result of field->val_str(&val) as a pointer to the result
instead of val (safety fix).
- Allocate memory for collected statistics in THD::mem_root, not in
TABLE::mem_root. Using TABLE::mem_root could cause the TABLE object to
grow if ANALYZE TABLE was run many times on the same table.
This was done in allocate_statistics_for_table(),
create_min_max_statistical_fields_for_table() and
create_min_max_statistical_fields_for_table_share().
- Store in TABLE_STATISTICS_CB::stats_available which statistics were
found in the statistics tables.
- Removed index_table from class Index_prefix_calc as it was not used.
- Added TABLE_SHARE::LOCK_statistics to ensure we don't load EITS
in parallel. First thread will load it, others will reuse the
loaded data.
- Eliminate read_histograms_for_table(). The loading happens within
read_statistics_for_tables() if histograms are needed.
One downside is that if we have read statistics without histograms
before and someone requires histograms, we have to read all statistics
again (once) from the statistics tables.
A smaller downside is the need to call alloc_root() for each
individual histogram. Before we could allocate all the space for
histograms with a single alloc_root.
- Fixed a bug in MyISAM and Aria where they did not properly notice
that a table had changed after ANALYZE TABLE. This was not a problem
before this patch as the MyISAM and Aria tables were then flushed
as part of ANALYZE TABLE, which hid this issue.
- Fixed a bug in ANALYZE table where table->records could be seen as 0
in collect_statistics_for_table(). The effect of this unlikely bug
was that a full table scan could be done even if
analyze_sample_percentage was not set to 1.
- Changed multiple mallocs in a row to use multi_alloc_root().
- Added a mutex protection in update_statistics_for_table() to ensure
that several tables are not updating the statistics at the same time.
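The "first thread loads, others reuse" behavior described for TABLE_SHARE::LOCK_statistics can be sketched with a mutex guarding a load-once flag. The struct and function below are hypothetical and much simpler than the real loading code; the value 42 is just a placeholder for data read from the statistics tables.

```cpp
#include <cassert>
#include <mutex>

struct StatsShare {
  std::mutex lock_statistics;  // plays the role of LOCK_statistics
  bool loaded = false;
  int load_count = 0;          // how many times the expensive load ran
  int stats = 0;               // placeholder for the loaded statistics
};

// Returns the statistics, loading them only on the first call.
int get_statistics(StatsShare& share) {
  std::lock_guard<std::mutex> guard(share.lock_statistics);
  if (!share.loaded) {
    share.stats = 42;  // placeholder for reading the statistics tables
    ++share.load_count;
    share.loaded = true;
  }
  return share.stats;  // later callers reuse the loaded data
}
```

Because the check and the load happen under the same lock, concurrent callers cannot both decide to load.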
Some of the changes in sql_statistics.cc are based on a patch from
Oleg Smirnov <olernov@gmail.com>
Co-authored-by: Oleg Smirnov <olernov@gmail.com>
Co-authored-by: Vicentiu Ciorbaru <cvicentiu@gmail.com>
Reviewer: Sergei Petrunia <sergey@mariadb.com>
MariaRocks is currently lagging behind the main branch of the RocksDB engine. This commit brings MariaRocks up to date with the latest release of RocksDB by backporting changes from Facebook’s MyRocks. These changes include API updates, bug fixes, and improvements for compatibility with RocksDB v8.1.1. Some system variables and metadata tables are modified to reflect the internal changes in RocksDB.
Additionally, this commit backports improved and more stable test cases from Facebook’s MyRocks, including tests for the write_unprepared isolation level of RocksDB. It also reverts workarounds for MDEV-29875 and MDEV-31057 and adds support for the latest compilation options.
The default values of the following system variables are changed:
* rocksdb_stats_level: 1 (kExceptHistogramOrTimers)
* rocksdb_wal_recovery_mode: 2 (kPointInTimeRecovery)
The following system variables are added:
* rocksdb_cancel_manual_compactions
* rocksdb_enable_iterate_bounds
* rocksdb_enable_pipelined_write
* rocksdb_enable_remove_orphaned_dropped_cfs
* rocksdb_manual_compaction_bottommost_level
* rocksdb_max_background_compactions
* rocksdb_max_background_flushes
* rocksdb_max_bottom_pri_background_compactions
* rocksdb_skip_locks_if_skip_unique_check
* rocksdb_track_and_verify_wals_in_manifest
* rocksdb_write_batch_flush_threshold
The following system variables are deprecated:
* rocksdb_hash_index_allow_collision
* rocksdb_new_table_reader_for_compaction_inputs
The following dynamic metadata table is added:
* INFORMATION_SCHEMA.ROCKSDB_LIVE_FILES_METADATA
The following status variables are added:
* rocksdb_manual_compactions_cancelled
* rocksdb_manual_compactions_pending
The following status variables are removed:
* rocksdb_block_cache_filter_bytes_evict
* rocksdb_block_cache_index_bytes_evict
* rocksdb_block_cachecompressed_hit
* rocksdb_block_cachecompressed_miss
* rocksdb_no_file_closes
* rocksdb_num_iterators
* rocksdb_number_deletes_filtered
* rocksdb_write_timedout
it was redundant, duplicating vcol_type == VCOL_GENERATED_STORED.
Note that VCOL_DEFAULT is not "stored", "stored vcol" means that after
rnd_next or index_read/etc the field value is already in the record[0]
and does not need to be calculated separately
ROW variables did not get assigned from subselects in these contexts:
BEGIN
DECLARE r ROW TYPE OF t1;
SET r=(SELECT * FROM t1 WHERE a=1);
END;
BEGIN
DECLARE r ROW TYPE OF t1 DEFAULT (SELECT * FROM t1 WHERE a=1);
END;
All fields of the ROW variable remained NULL.
Derived table creation code would call Field::make_new_field() which would
memcpy the Field object from the source table, including Field::read_stats.
But the temp. table as a whole had table->stats_is_read=false. Which was
correct but not consistent with Field::read_stats and caused an assertion.
Fixed by making sure that Field::read_stats=NULL for fields in the new
temporary (i.e. work) tables.
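The shape of this fix can be shown with a toy field type: a plain copy carries the statistics pointer over, so the copy for a work table has to reset it explicitly. The names below are hypothetical stand-ins, and an ordinary struct copy is used in place of the memcpy mentioned above.

```cpp
#include <cassert>

struct ColumnStats {
  double avg_length;
};

struct ToyField {
  const char* name;
  ColumnStats* read_stats;  // statistics loaded for the source table, if any
};

// Before the fix: the copy still points at the source table's statistics.
ToyField make_new_field_buggy(const ToyField& src) {
  return src;
}

// After the fix: fields of a new temporary (work) table carry no statistics.
ToyField make_new_field(const ToyField& src) {
  ToyField copy = src;
  copy.read_stats = nullptr;
  return copy;
}
```

The source field keeps its pointer, and only the copy made for the work table is cleared.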