Commit graph

201861 commits

Author SHA1 Message Date
Sergei Golubchik
d6add9a03d initial support for vector indexes
MDEV-33407 Parser support for vector indexes

The syntax is

  create table t1 (... vector index (v) ...);

limitation:
* v is a binary string and NOT NULL
* only one vector index per table
* temporary tables are not supported

MDEV-33404 Engine-independent indexes: subtable method

added support for so-called "high level indexes", they are not visible
to the storage engine, implemented on the sql level. For every such
an index in a table, say, t1, the server implicitly creates a second
table named, like, t1#i#05 (where "05" is the index number in t1).
This table has a fixed structure, no frm, not accessible directly,
doesn't go into the table cache, needs no MDLs.

MDEV-33406 basic optimizer support for k-NN searches

for a query like SELECT ... ORDER BY func() optimizer will use
item_func->part_of_sortkey() to decide what keys can be used
to resolve ORDER BY.
2024-11-05 14:00:48 -08:00
Sergei Golubchik
9ccf02a9a7 MDEV-32885 VEC_DISTANCE() function 2024-11-05 14:00:48 -08:00
Sergei Golubchik
08a7f18b19 cleanup: init_tmp_table_share(bool thread_specific)
let the caller tell init_tmp_table_share() whether the table
should be thread_specific or not.

In particular, internal tmp tables created in the slave thread
are perfectly thread specific
2024-11-05 14:00:48 -08:00
Sergei Golubchik
44c6328cbb cleanup: thd->alloc<>() and thd->calloc<>()
create templates

  thd->alloc<X>(n) to use instead of (X*)thd->alloc(sizeof(X)*n)

and the same for thd->calloc(). By the default the type is char,
so old usage of thd->alloc(size) works too.
2024-11-05 14:00:48 -08:00
Sergei Golubchik
eff16d7593 Revert "MDEV-15458 Segfault in heap_scan() upon UPDATE after ADD SYSTEM VERSIONING"
This partially reverts 43623f04a9

Engines have to set ::position() after ::write_row(), otherwise
the server won't be able to refer to the row just inserted.
This is important for high-level indexes.

heap part isn't reverted, so heap doesn't support high-level indexes.
to fix this, it'll need info->lastpos in addition to info->current_ptr
2024-11-05 14:00:48 -08:00
Sergei Golubchik
07ec1a9e37 cleanup: unused function argument 2024-11-05 14:00:48 -08:00
Sergei Golubchik
aa09cb3b11 open frm for DROP TABLE
needed to get partitioning and information about
secondary objects
2024-11-05 14:00:48 -08:00
Sergei Golubchik
c1b4f3a32c cleanup: extract ha_create_table_from_share() 2024-11-05 14:00:48 -08:00
Sergei Golubchik
1fe8a1bb76 cleanup: generalize ER_INNODB_NO_FT_TEMP_TABLE 2024-11-05 14:00:48 -08:00
Sergei Golubchik
fd69abe44f cleanup: generalize ER_SPATIAL_CANT_HAVE_NULL 2024-11-05 14:00:48 -08:00
Sergei Golubchik
356baeea4b cleanup: make_long_hash_field_name() and add_hash_field() 2024-11-05 14:00:47 -08:00
Sergei Golubchik
062f8eb37d cleanup: key algorithm vs key flags
the information about index algorithm was stored in two
places inconsistently split between both.

BTREE index could have key->algorithm == HA_KEY_ALG_BTREE, if the user
explicitly specified USING BTREE or HA_KEY_ALG_UNDEF, if not.

RTREE index had key->algorithm == HA_KEY_ALG_RTREE
and always had key->flags & HA_SPATIAL

FULLTEXT index had  key->algorithm == HA_KEY_ALG_FULLTEXT
and always had key->flags & HA_FULLTEXT

HASH index had key->algorithm == HA_KEY_ALG_HASH or HA_KEY_ALG_UNDEF

long unique index always had key->algorithm == HA_KEY_ALG_LONG_HASH

In this commit:

All indexes except BTREE and HASH always have key->algorithm
set, HA_SPATIAL and HA_FULLTEXT flags are not used anymore (except
for storage to keep frms backward compatible).

As a side effect ALTER TABLE now detects FULLTEXT index renames correctly
2024-11-05 14:00:47 -08:00
Sergei Golubchik
f5e9c4e9ef cleanup: Queue and Bounded_queue
Bounded_queue<> pretended to be a typesafe C++ wrapper
on top of pure C queues.h.

But it wasn't, it was tightly bounded to filesort and only useful there.

* implement Queue<> - a typesafe C++ wrapper on top of QUEUE
* move Bounded_queue to filesort.cc, remove pointless "generalizations"
  change it to use Queue.
* remove bounded_queue.h
* change subselect_rowid_merge_engine to use Queue, not QUEUE
2024-11-05 14:00:47 -08:00
Sergei Golubchik
ae59127158 cleanup: lex_string_set3() 2024-11-05 14:00:47 -08:00
Sergei Golubchik
32e6f8ff2e cleanup: remove unconditional #ifdef's 2024-11-05 14:00:47 -08:00
Sergei Golubchik
570daecf49 cleanup: const in List::push_front() 2024-11-05 14:00:47 -08:00
Sergei Golubchik
44ff2f7831 reject invalid spatial key declarations in the parser 2024-11-05 14:00:47 -08:00
Sergei Golubchik
0cc01bde45 cleanup: pass TABLE_SHARE to store_key_options()
preparation for indexes that can be in TABLE_SHARE but not in TABLE
2024-11-05 14:00:47 -08:00
Sergei Golubchik
949fed514a cleanup: get_float convenience helper
more helpers like that can be added as needed
2024-11-05 14:00:47 -08:00
Sergei Golubchik
115d3e050c cleanup: engine_option_value::Value::find_in_list() helper 2024-11-05 14:00:47 -08:00
Sergei Golubchik
d046aca0c7 cleanup: CREATE_TYPELIB_FOR() helper 2024-11-05 14:00:47 -08:00
Sergei Golubchik
9fa31c1bd9 cleanup: spaces, casts, comments 2024-11-05 14:00:47 -08:00
Sergei Golubchik
9f2adffcca memroot improvement: fix savepoint support
savepoint support was added a year ago as get_last_memroot_block()
and free_all_new_blocks(), but it didn't work and was disabled.

* fix it to work
* instead of freeing the memory, only mark blocks free - this feature
  is supposed to be used inside a loop (otherwise there is no need
  to free anything, end of statement will do it anyway). And freeing
  blocks inside a loop is a bad idea, as they'll be all malloc-ed
  on the next iteration again. So, don't.
* fix a bug in mark_blocks_free() - it doesn't change the number of blocks
2024-11-05 14:00:47 -08:00
Sergei Golubchik
4f4c5a2ba9 fix a typo and an old bug in prefschema.transaction test 2024-11-05 14:00:47 -08:00
Sergei Golubchik
70f000f1dc fix main.plugin_vars test to cleanup after itself 2024-11-05 14:00:46 -08:00
Sergei Golubchik
9ddac64188 make INFORMATION_SCHEMA.STATISTICS.COMMENT not nullable
as it can never be null (only "" or "disabled")
2024-11-05 14:00:46 -08:00
Sergei Golubchik
680bdb76a6 fix for 32bit 2024-11-05 14:00:46 -08:00
Oleg Smirnov
a914087fab MDEV-35307 Unexpected error WARN_SORTING_ON_TRUNCATED_LENGTH or assertion failure in diagnostics area #2
When strict mode is enabled, all warnings during `INSERT` are
converted to errors regardless of their actual severity.
`WARN_SORTING_ON_TRUNCATED_LENGTH` is not considered severe enough
to be elevated to the ERROR level, and this commit fixes that
2024-11-05 14:52:20 +07:00
Alexander Barkov
a4cb03ec56 Adding a comment near the keyword NOCOPY 2024-11-05 09:14:06 +04:00
Alexander Barkov
5d4a4d2091 Fixing main.type_timestamp failure with --view
The patch for
  MDEV-35250 Assertion `dec <= 6' failed in my_timestamp_binary_length
added a test which depends on
  MDEV-29534 In view FROM_UNIXTIME adds .000000 in the result

Adding --disable_view_protocol around the affected statements.
2024-11-02 12:46:27 +04:00
Sergei Golubchik
ac7fe8b214 fix main.selectivity_notembedded --view 2024-11-01 21:01:31 +01:00
Sergei Golubchik
0a3452cf83 MDEV-35229 fix the test for --view
also, enable it for --ps
2024-11-01 20:52:58 +01:00
Sergei Golubchik
947de4b1db print more digits for floating point options in in mariadbd --help 2024-11-01 08:58:43 +01:00
Monty
40810baffe MDEV-33144 Implement the Percona variable slow_query_log_always_write_time
This task is inspired by the Percona implementation of
slow_query_log_always_write_time.

This task implements the variable log_slow_always_query_time (name
matching other MariaDB variables using the slow query log). The
default value for the variable is 31536000, which makes MariaDB
compatible with older installations.

For queries with execution time longer than log_slow_always_query_time
the variables log_slow_rate_limit and log_slow_min_examined_row_limit
will be ignored and the query will be written to the slow query log
if there is no other limitations (like log_slow_filter etc).

Other things:
- long_query_time internal variable renamed to log_slow_query_time.
- More descriptive information for "log_slow_query_time".
2024-11-01 08:58:37 +01:00
Oleg Smirnov
bf9662f6fa MDEV-35275 Unexpected WARN_SORTING_ON_TRUNCATED_LENGTH or assertion failure in diagnostics area
MDEV-27277 added warnings on truncation during sorting for SELECTs
but did not for DML operations. However, UPDATEs and DELETEs may also
perform sorting and thus produce warnings. This commit fixes that
2024-10-30 18:47:11 +07:00
Alexander Barkov
556a40dce0 MDEV-35229 NOCOPY has become reserved word bringing wide incompatibility
This patch was suggested by Sergei Golubchik.

It reverts the second patch from the PR:

  commit fa5eeb4931
    Fixed ALTER TABLE NOCOPY keyword failure

and adds NOCOPY_SYM into keyword_func_sp_var_and_label.

The price is one extra shift/recuce conflict in yy_oracle.yy.
This should to tolerable.
2024-10-30 13:58:20 +04:00
Alexander Barkov
8c0a260a5b MDEV-35250 Assertion `dec <= 6' failed in my_timestamp_binary_length
The TIMESTAMP related code did not handle AUTO_SEC_PART_DIGITS.
FROM_UNIXTIME() sets its member 'decimals' to AUTO_SEC_PART_DIGITS.
So some scripts involving FROM_UNIXTIME() crashed on assert in debug
builds and returned unexpected results in release builds.
2024-10-30 11:09:21 +04:00
Alexander Barkov
a79f314f1b MDEV-34817 perfschema.lowercase_fs_off fails on buildbot
This is a workaround patch to make buildbot green.

Renaming databases from db1/DB2 to m33020_db1/m33020_DB1
to make them unique. So the garbage left by other tests
does not show up any more.

The real problem will be fixed under terms of:
  MDEV-35282 Performance schema does not clear package routines
2024-10-30 10:21:29 +04:00
Aleksey Midenkov
cc183489da MDEV-27293 Allow converting a versioned table from implicit
to explicit row_start/row_end columns

In case of adding both system fields of same type (length, unsigned
flag) as old implicit system fields do the rename of implicit system
fields to the ones specified in ALTER, remove SYSTEM_INVISIBLE flag in
that case. Correct PERIOD clause must be specified in ALTER as well.

MDEV-34904 Inplace alter for implicit to explicit versioning is broken

Whether ALTER goes inplace and how it goes inplace depends on
handler_flags which goes from alter_info->flags by this logic:

  ha_alter_info->handler_flags|= (alter_info->flags & ~flags_to_remove);

ALTER_VERS_EXPLICIT was not in flags_to_remove and its value (1ULL <<
35) clashed with ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX.

ALTER_VERS_EXPLICIT must not affect inplace, it is SQL-only so we
remove it from handler_flags.
2024-10-29 17:46:40 +03:00
Sergei Golubchik
5e5c3c7cb6 post-merge changes
* remove duplicate test file
* move all uuidv7 tests into plugin/type_uuid/mysql-test/type_uuid/
* remove mysys/ changes
* auto my_random_bytes() fallback - removes duplicate code from uuid,
  and fixes all other users of my_random_bytes() that don't check
  the return value (because, perhaps, they don't need crypto-strong
  random bytes)
* End of 11.6 -> 11.7 in tests
* clarify the warning text
* UUID_VERSION_MASK()/UUID_VARIANT_MASK() must not depend on the version
* allow 4x more monotonic uuidv7 per millisecond - instead of stretching
  1000 microseconds over 12 bits, let's use extra 2 bits as a counter
* rename for compatibility with Percona Server (uuid_v4, uuid_v7)
2024-10-29 14:47:32 +01:00
StefanoPetrilli
2fe269fdcb MDEV-32637 Implement native UUID7 function 2024-10-29 14:47:32 +01:00
Daniel Black
ae69ac204a MDEV-32583 UUID() should be treated as stochastic for the purposes of forcing query materialization
Port 9e800eda86 changing lex->safe_to_cache_query
to lex->uncacheable(UNCACHEABLE_RAND).
2024-10-29 14:47:32 +01:00
Alexander Barkov
20611c8ae7 cleanup: MDEV-11339 Implement native UUID4 function
- Moving the class UUIDv1 into a separate file sql_type_uuid_v1.h

- Adding a new class UUIDv4, similar to UUIDv1

- Changing the way how my_random_bytes() failures are handled.
  Instead of raising an error it now raises a note.
  Reasoning: if we're in the middle of a multi-million row
  transaction and one UUIDv4 generation fails, it's not a good
  idea to throw away the entire transaction. Instead, let's
  generate bytes using a my_rnd() loop.

- Adding a new test func_uuid_v4.test to demonstrate that the UUIDv4()
  returned type is "UUID NOT NULL".

- Adding a new test func_uuidv4_debug.test to emulate my_random_bytes()
  failures

- Adding a template Item_func_uuid_vx to share the code
  between the implementations of UUID() and UUIDv4().
2024-10-29 14:47:32 +01:00
StefanoPetrilli
ef585df440 MDEV-11339 Implement native UUID4 function 2024-10-29 14:47:32 +01:00
Teemu Ollakka
47dd617c7f MDEV-35265 wsrep.wsrep-recover, wsrep.wsrep-recover-v25 fail on assertion
The tests fail on assertion

    ut_ad(!wsrep_is_wsrep_xid(&trx->xid));

in `innobase_recover_rollback_by_xid()`.

The fix is to avoid async rollback for prepared transactions
when wsrep is ON or wsrep recovery is in progress. The rationale
is that the rollback of prepared transactions must complete
before the node starts applying write sets after SST, or in
case of wsrep recovery, the recovery must complete before the
process exists.

Change the assertion into stronger one

    ut_ad(!(WSREP_ON || wsrep_recovery));

to catch if the async rollback codepath is taken when wsrep is
enabled.
2024-10-29 12:15:53 +02:00
Yuchen Pei
4b6922a315
MDEV-25008: UPDATE/DELETE: Cost-based choice IN->EXISTS vs Materialization
Single-table UPDATE/DELETE didn't provide outer_lookup_keys value for
subqueries. This didn't allow to make a meaningful choice between
IN->EXISTS and Materialization strategies for subqueries.

Fix this:
* Make UPDATE/DELETE save Sql_cmd_dml::scanned_rows,
* Then, subquery's JOIN::choose_subquery_plan() can fetch it from
there for outer_lookup_keys

Details:
UPDATE/DELETE now calls select_lex->optimize_unflattened_subqueries()
twice, like SELECT does (first call optimize_constant_subquries() in
JOIN::optimize_inner(), then call optimize_unflattened_subqueries() in
JOIN::optimize_stage2()):
1. Call with const_only=true before any optimizations. This allows
range optimizer and others to use the values of cheap const
subqueries.
2. Call it with const_only=false after range optimizer, partition
pruning, etc. outer_lookup_keys value is provided, so it's possible to
pick a good subquery strategy.

Note: PROTECT_STATEMENT_MEMROOT requires that first SP execution
performs subquery optimization for all subqueries, even for degenerate
query plans like "Impossible WHERE". Due to that, we ensure that the
call to optimize_unflattened_subqueries (with const_only=false) even
for degenerate query plans still happens, as was the case before this
change.
2024-10-23 23:51:24 +11:00
Oleg Smirnov
fd87e01f38 MDEV-27277 Add a warning when max_sort_length is reached
During a query execution some sorting and grouping operations
on strings may be involved. System variable max_sort_length defines
the maximum number of bytes to use when comparing strings during
sorting/grouping. Thus, the comparable parts of strings may be less
than their actual size, so the results of the query may be not
sorted/grouped properly.
To indicate that some comparisons were done on a truncated lengths,
a new warning has been introduced with this commit.
2024-10-22 22:39:36 +07:00
Alexander Barkov
0d17c540a5 MDEV-27277 Add a warning when max_sort_length is reached
Step#1: fixing the return type of strnxfrm() from size_t to this structure:

typedef struct
{
  size_t m_output_length;
  size_t m_source_length_used;
  uint m_warnings;
} my_strnxfrm_ret_t;
2024-10-22 21:42:53 +07:00
Alexander Barkov
e1cd3c4033 MDEV-12252 ROW data type for stored function return values
Adding support for the ROW data type in the stored function RETURNS clause:

- explicit ROW(..members...) for both sql_mode=DEFAULT and sql_mode=ORACLE

  CREATE FUNCTION f1() RETURNS ROW(a INT, b VARCHAR(32)) ...

- anchored "ROW TYPE OF [db1.]table1" declarations for sql_mode=DEFAULT

  CREATE FUNCTION f1() RETURNS ROW TYPE OF test.t1 ...

- anchored "[db1.]table1%ROWTYPE" declarations for sql_mode=ORACLE

  CREATE FUNCTION f1() RETURN test.t1%ROWTYPE ...

Adding support for anchored scalar data types in RETURNS clause:

- "TYPE OF [db1.]table1.column1" for sql_mode=DEFAULT

  CREATE FUNCTION f1() RETURNS TYPE OF test.t1.column1;

- "[db1.]table1.column1" for sql_mode=ORACLE

  CREATE FUNCTION f1() RETURN test.t1.column1%TYPE;

Details:

- Adding a new sql_mode_t parameter to
    sp_head::create()
    sp_head::sp_head()
    sp_package::create()
    sp_package::sp_package()
  to guarantee early initialization of sp_head::m_sql_mode.
  Before this change, this member was not initialized at all during
  CREATE FUNCTION/PROCEDURE/PACKAGE statements, and was not used.
  Now it needs to be initialized to write properly the
  mysql.proc.returns column, according to the create time sql_mode.

- Code refactoring to make the things simpler and functions smaller:

  * Adding a new method
    Field_row::row_create_fields(THD *thd, List<Spvar_definition> *list)
    to make a Virtual_tmp_table with Fields for ROW members
    from an explicit definition.

  * Adding a new method
    Field_row::row_create_fields(THD *thd, const Spvar_definition &def)
    to make a Virtual_tmp_table with Fields for ROW members
    from an explicit or a table anchored definition.

  * Adding a new method
    Item_args::add_array_of_item_field(THD *thd, const Virtual_tmp_table &vtable)
    to create and array of Item_field corresponding to all Field instances
    in a Virtual_tmp_table

  * Removing Item_field_row::row_create_items(). It was decomposed
    into the new methods described above.

  * Moving the code from the loop body in sp_rcontext::init_var_items()
    into a separate method Spvar_definition::make_item_field_row(),
    to make the code clearer (smaller functions).
    make_item_field_row() itself uses the new methods described above.

- Changing the data type of sp_head::m_return_field_def
  from Column_definition to Spvar_definition.
  So now it supports not only SQL column field types,
  but also explicit ROW and anchored ROW data types,
  as well as anchored column types.

- Adding a new Column_definition parameter to sp_head::create_result_field().
  Before this patch, create_result_field() took the definition only
  from m_return_field_def. Now it's also called with a local Column_definition
  variable which contains the explicit definition resolved from an
  anchored defition.

- Modifying sql_yacc.yy to support the new grammar.
  Adding new helper methods:
    * sf_return_fill_definition_row()
    * sf_return_fill_definition_rowtype_of()
    * sf_return_fill_definition_type_of()

- Fixing tests in:
  * Virtual_tmp_table::setup_field_pointers() in sql_select.cc
  * Send_field::normalize() in field.h
  * store_column_type()
  to prevent calling Type_handler_row::field_type(),
  which is implemented a DBUG_ASSERT(0).
  Before this patch the affected methods and functions were called only
  for scalar data types. Now ROW is also possible.

- Adding a new virtual method Field::cols()

- Overriding methods:
   Item_func_sp::cols()
   Item_func_sp::element_index()
   Item_func_sp::check_cols()
   Item_func_sp::bring_value()
  to support the ROW data type.

- Extending the rule sp_return_type to support
  * explicit ROW and anchored ROW data types
  * anchored scalar data types

- Overriding Field_row::sql_type() to print
  the data type of an explicit ROW.
2024-10-21 07:59:29 +04:00
Alexander Barkov
dfaf7e2eb4 MDEV-15751 CURRENT_TIMESTAMP should return a TIMESTAMP [WITH TIME ZONE?]
Changing the return type of the following functions:
  - CURRENT_TIMESTAMP, CURRENT_TIMESTAMP(), NOW()
  - SYSDATE()
  - FROM_UNIXTIME()
from DATETIME to TIMESTAMP.

Note, the old function NOW() returning DATETIME is still available
as LOCALTIMESTAMP or LOCALTIMESTAMP(), e.g.:

  SELECT
    LOCALTIMESTAMP,     -- DATETIME
    CURRENT_TIMESTAMP;  -- TIMESTAMP

The change in the functions return data type fixes some problems
that occurred near a DST change:

- Problem #1

INSERT INTO t1 (timestamp_field) VALUES (CURRENT_TIMESTAMP);
INSERT INTO t1 (timestamp_field) VALUES (COALESCE(CURRENT_TIMESTAMP));

could result into two different values inserted.

- Problem #2

INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526));
INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526+3600));

could result into two equal TIMESTAMP values near a DST change.

Additional changes:

- FROM_UNIXTIME(0) now returns SQL NULL instead of '1970-01-01 00:00:00'
  (assuming time_zone='+00:00')

- UNIX_TIMESTAMP('1970-01-01 00:00:00') now returns SQL NULL instead of 0
  (assuming time_zone='+00:00'

These additional changes are needed for consistency with TIMESTAMP fields,
which cannot store '1970-01-01 00:00:00 +00:00'
2024-10-19 22:48:23 +02:00