Commit graph

202447 commits

Author SHA1 Message Date
Sergei Golubchik
cbc2812f80 MDEV-35287 ER_KEY_NOT_FOUND upon INSERT into InnoDB table with vector key under READ COMMITTED
InnoDB cannot enable internal bulk insert for hlindex tables
2024-11-05 14:00:52 -08:00
Sergei Golubchik
ad33ffc0b5 MDEV-35296 DESC does not work in ORDER BY with vector key
only use vector indexes for ORDER BY ... ASC
2024-11-05 14:00:52 -08:00
Sergei Golubchik
7feec30939 relax the XA recovery error
it's just a suggestion anyway, not a bullet-proof check,
let's not act as if it is
2024-11-05 14:00:52 -08:00
Sergei Golubchik
b09c8b03d7 MDEV-35244 Vector-related system variables could use better names
considering that users don't interact with MariaDB vector search directly,
but primarily use AI frameworks, we should use names familiar
to vector store connector writers and for AI framework users.
That is industry standard M and ef.

mhnsw_cache_size -> mhnsw_max_cache_size
mhnsw_distance_function -> mhnsw_default_distance
mhnsw_max_edges_per_node -> mhnsw_default_m
mhnsw_min_limit -> mhnsw_ef_search

inside CREATE TABLE:
max_edges_per_node -> m
distance_function -> distance
2024-11-05 14:00:52 -08:00
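
The system variable renames listed in the commit above could be applied to an existing option file with a small script. What follows is only a minimal sketch in Python, assuming options appear as `name=value` lines; the helper name `rename_option` and the table `SYSVAR_RENAMES` are hypothetical and not part of MariaDB. The CREATE TABLE option renames (`max_edges_per_node`, `distance_function`) are not handled here.

```python
import re

# Old -> new system variable names, copied from the commit message above.
SYSVAR_RENAMES = {
    "mhnsw_cache_size": "mhnsw_max_cache_size",
    "mhnsw_distance_function": "mhnsw_default_distance",
    "mhnsw_max_edges_per_node": "mhnsw_default_m",
    "mhnsw_min_limit": "mhnsw_ef_search",
}

# Optional indentation, an option name, then an optional "= value" part.
OPTION_RE = re.compile(r"^(\s*)([A-Za-z0-9_-]+)(\s*=.*)?$")


def rename_option(line: str) -> str:
    """Rewrite one option-file line (hypothetical helper).

    Lines that are not plain options (section headers, comments, blanks)
    and options that were not renamed are returned unchanged.
    """
    body = line.rstrip("\n")
    newline = line[len(body):]  # preserve a trailing newline, if any
    match = OPTION_RE.match(body)
    if not match:
        return line
    indent, name, rest = match.group(1), match.group(2), match.group(3) or ""
    # Option files accept dashes in place of underscores in option names.
    key = name.replace("-", "_")
    return indent + SYSVAR_RENAMES.get(key, name) + rest + newline
```

For example, `rename_option("mhnsw_min_limit=20")` yields `mhnsw_ef_search=20`, while a section header such as `[mysqld]` passes through untouched.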
Sergei Golubchik
784becf3e1 MDEV-35267 Server crashes in _ma_reset_history upon altering on Aria table with vector key under lock
ALTER TABLE needs to open hlindex tables early enough, right after they
were created, so that cleanup after an error would see and delete them.

But they need to be external_lock-ed only in copy_data_between_tables,
after mysql_trans_prepare_alter_copy_data().

Let's move locking out of hlindex_open() into hlindex_lock()
2024-11-05 14:00:52 -08:00
Sergei Golubchik
5d9ebef41e MDEV-35258 Mariabackup does not work with MyISAM tables with vector keys
recognize *#i#* files in mariadb-backup
2024-11-05 14:00:52 -08:00
Sergei Golubchik
0b9bc6c3cd MDEV-35246 Vector search skips a row in the table
stronger condition in select_neighbors() to reject exact matches too
2024-11-05 14:00:52 -08:00
Sergey Vojtovich
d50663198c DDL recovery for high-level indexes 2024-11-05 14:00:52 -08:00
Sergey Vojtovich
883fb66cd4 MDEV-35130 Assertion fails in trx_t::check_bulk_buffer upon CREATE.. SELECT with vector key
Similarly to "ALTER TABLE fixes for high-level indexes", don't enable bulk
insert when issuing create ... insert into a table containing vector
index. InnoDB can't handle situation when bulk insert is enabled for
one table but disabled for another. We can't do bulk insert on vector
index as it does table updates currently.
2024-11-05 14:00:52 -08:00
Sergei Golubchik
f6de9a379a MDEV-34919 post-fix
* add Aria truncate checks
* do store_lock() with a correct TL_xxx level
* remove InnoDB workaround for missing store_lock (from MDEV-35032)
* don't start transaction in temp tables (for Aria, with a test case)
2024-11-05 14:00:52 -08:00
Sergey Vojtovich
1cc7ef52e3 MDEV-34919 Aria crashes with high-level (vector) indexes
Since high-level index tables do not participate in thr_multi_lock(), added
explicit call to THR_LOCK::start_trans(). This is needed mostly for Aria to
handle transaction logging.
2024-11-05 14:00:52 -08:00
Sergei Golubchik
72839c1435 MDEV-35245 SHOW CREATE TABLE produces unusable statement for vector fields with constant default value
print default values for binary types as binary strings
2024-11-05 14:00:52 -08:00
Sergei Golubchik
053bd80d43 MDEV-35230 ASAN errors upon reading from joined temptable views with vector type
fix Field_vector::get_copy_func() for the case when length_bytes differ

fix do_copy_vec() to not guess length_bytes but take it from the field
(for keys length_bytes is always 2 for any length)
2024-11-05 14:00:52 -08:00
Sergei Golubchik
7d081c1b83 MDEV-35223 REPAIR does not fix MyISAM table with vector key after crash recovery
resort to alter for repair too
2024-11-05 14:00:52 -08:00
Sergei Golubchik
e8cff8e829 MDEV-35219 Unexpected ER_DUP_KEY after OPTIMIZE on MyISAM table with vector key
in-engine optimize can break hlindexes. let's fall back to ALTER
2024-11-05 14:00:52 -08:00
Sergei Golubchik
8988decbfe MDEV-35220 Assertion `!item->null_value' failed upon VEC_TOTEXT call
don't forget to reset null_value for each row
2024-11-05 14:00:52 -08:00
Sergei Golubchik
14364b09b9 MDEV-35236 Assertion `(mem_root->flags & 4) == 0' failed in safe_lexcstrdup_root
followup for MDEV-35092
2024-11-05 14:00:52 -08:00
Sergei Golubchik
1a53048299 MDEV-35215 ASAN errors in Item_func_vec_fromtext::val_str upon VEC_FROMTEXT with an invalid argument 2024-11-05 14:00:52 -08:00
Sergei Golubchik
96eb66e5b3 MDEV-35205 Server crash in online alter upon concurrent ALTER and DML on table with vector field
test case
2024-11-05 14:00:52 -08:00
Sergei Golubchik
e020a3a2ce MDEV-35210 Vector type cannot store values which VEC_FromText produces and VEC_ToText accepts
let VEC_FromText validate that the vector l2squared isn't NaN.
VEC_ToText still prints everything.
2024-11-05 14:00:52 -08:00
Sergei Golubchik
f336b10bb1 MDEV-35212 Server crashes in Item_func_vec_fromtext::val_str upon query from empty table 2024-11-05 14:00:52 -08:00
Sergei Golubchik
2bec721316 MDEV-35203 ASAN errors or assertion failures in row_sel_convert_mysql_key_to_innobase upon query from table with usual key on vector field
add test
2024-11-05 14:00:52 -08:00
Sergei Golubchik
2e74a00d9d MDEV-35195 Assertion `tab->join->order' fails upon vector search with DISTINCT #2
MDEV-35337 Server crash or assertion failure in join_read_first upon using vector distance in group by

allow Item_func_distance to be not only in tab->join->order,
but alternatively in tab->join->group_list
2024-11-05 14:00:52 -08:00
Sergei Golubchik
926b339b93 MDEV-35194 non-BNL join fails on assertion
with streaming implemented, mhnsw no longer needs to know
the LIMIT in advance. let's just cap it to avoid allocating
too much memory for the one step result set
2024-11-05 14:00:52 -08:00
Sergei Golubchik
597e34d000 MDEV-35213 Server crash or assertion failure upon query with high value of mhnsw_min_limit
mhnsw_min_limit must not be larger than candidates queue size
2024-11-05 14:00:52 -08:00
Sergei Golubchik
dd9a5dd5b5 MDEV-35204 mysqlbinlog --verbose fails on row events with vector type
test case
2024-11-05 14:00:52 -08:00
Sergei Golubchik
ed9fec0266 MDEV-35177 Unexpected ER_TRUNCATED_WRONG_VALUE_FOR_FIELD, diagnostics area assertion failures upon EITS collection with vector type 2024-11-05 14:00:52 -08:00
Sergei Golubchik
db10e5cf6c MDEV-35160 RBR does not work with vector type, ER_SLAVE_CONVERSION_FAILED 2024-11-05 14:00:52 -08:00
Sergei Golubchik
8f49fb8cc3 MDEV-35191 Assertion failure in Create_tmp_table::finalize upon DISTINCT with vector type
test only
2024-11-05 14:00:51 -08:00
Sergei Golubchik
cfbf065893 MDEV-35176 ASAN errors in Field_vector::store with optimizer_trace enabled 2024-11-05 14:00:51 -08:00
Sergei Golubchik
425aa95655 MDEV-35178 Assertion failure in Field_vector::store upon INSERT IGNORE with a wrong data 2024-11-05 14:00:51 -08:00
Sergei Golubchik
88adcbf35a MDEV-35182 crash in online_alter_end_trans with XA over vector indexes
ONLINE ALTER didn't expect XA PREPARE to fail.
Mark rollback on failed prepare with the XA_ROLLBACK_ONLY state,
detect that in ONLINE ALTER
2024-11-05 14:00:51 -08:00
Sergei Golubchik
5bde23990b MDEV-35159 Assertion `tab->join->select_limit < (~ (ha_rows) 0)' fails upon forcing vector key
init_from_binary_frm_image() wrongly assumed that
* if a table has primary key
* and it has the HA_PRIMARY_KEY_IN_READ_INDEX flag
* then ORDER BY any index automatically implies ORDER BY pk at the end,
   that is for an index (a,b,c) ORDER BY a,b,c means ORDER BY a,b,c,pk

which is wrong, it holds not for _any index_ but only for indexes
that can be used for ORDER BY.

So, don't do `field->part_of_sortkey= share->keys_in_use`
but introduce `sort_keys_in_use` and use that.
2024-11-05 14:00:51 -08:00
Sergei Golubchik
a83afd8537 cleanup: remove String::append_float
between set_real() and set_fcvt(), a third String method for
real->string conversion looks definitely redundant
2024-11-05 14:00:51 -08:00
Sergei Golubchik
88119addff Vec_ToText was underestimating max_length of the result
switch to a more predictable, shorter, and more correct output
that is, print as many significant digits as necessary.
but not more (they'd be just zeros) and not less (it'd lose precision)
2024-11-05 14:00:51 -08:00
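
The idea in the commit above, printing as many significant digits as necessary but no more, can be illustrated outside the server. The following is a minimal Python sketch, not MariaDB's implementation; it assumes vector elements are IEEE 754 single-precision floats, and the function names `to_float32` and `shortest_float32_repr` are hypothetical.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def shortest_float32_repr(x: float) -> str:
    """Return the fewest significant digits that still round-trip as float32."""
    x = to_float32(x)
    # 9 significant decimal digits are always enough to identify a float32.
    for digits in range(1, 10):
        text = f"{x:.{digits}g}"
        if to_float32(float(text)) == x:
            return text
    return f"{x:.9g}"
```

With this sketch, a stored value of 0.1 prints as `0.1` rather than with trailing noise digits, and parsing the printed text back always recovers the same single-precision value.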
Sergei Golubchik
91720da9be MDEV-35158 Assertion `res->length() > 0 && res->length() % 4 == 0' fails upon increasing length of vector column 2024-11-05 14:00:51 -08:00
Sergei Golubchik
6634c88480 MDEV-35150 Column containing non-vector tables can be modified to VECTOR type without warnings 2024-11-05 14:00:51 -08:00
Sergei Golubchik
ca28761066 MDEV-35147 Inconsistent NULL handling in vector type 2024-11-05 14:00:51 -08:00
Sergei Golubchik
f274cf1c25 MDEV-35141 Server crashes in Field_vector::report_wrong_value upon statistic collection 2024-11-05 14:00:51 -08:00
Sergei Golubchik
78119d1ae5 MDEV-33410 VECTOR data type 2024-11-05 14:00:51 -08:00
Sergei Golubchik
b56ca29f89 MDEV-35105 Assertion `tab->join->order' fails upon vector search with DISTINCT
don't apply distinct optimization to order by a vector index
2024-11-05 14:00:51 -08:00
Sergei Golubchik
9ddb94f60e MDEV-35104 Invalid (old?) table or database name upon DDL on table with vector key and unique key
InnoDB rename needs the same workaround for hlindexes
as it has for partitions
2024-11-05 14:00:51 -08:00
Sergei Golubchik
7d9c0e4f62 MDEV-35092 Server crash, hang or ASAN errors in mysql_create_frm_image upon using non-default table options and system variables
extend the option_list explicitly on CREATE/ALTER, not implicitly
on parsing.
2024-11-05 14:00:51 -08:00
Sergei Golubchik
cdc7253787 make MyISAM and Aria report correct reflength to the server
MyISAM and Aria used to lie to the server about the reflength value.
One value was used internally, it was stored on disk, e.g. in indexes,
and couldn't be changed without full table rebuild. A differently
calculated value was reported to the server - that value was sometimes
larger than the true reflength.

That caused the server to allocate more memory per position than
necessary - affecting filesort, join buffer usage, optimizer cost
calculations, and maybe more.
2024-11-05 14:00:51 -08:00
Sergei Golubchik
ea1e720391 MDEV-35078 Server crash or ASAN errors in mhnsw_insert
when adding a column or index that uses plugin-defined
sysvar-based options with ALTER TABLE ... ADD, the server
was using the default value of the sysvar, not the current one.

CREATE TABLE was correctly using the current sysvar value.

Fix it so that new columns/indexes added in ALTER TABLE ... ADD
would use a current sysvar value. Existing columns/indexes
in ALTER TABLE would keep using the default sysvar value
(unless they had an explicit value in frm).
2024-11-05 14:00:51 -08:00
Sergei Golubchik
855aefb7b5 mysqldump and mariadb-backup tests of vector indexes 2024-11-05 14:00:51 -08:00
Sergei Golubchik
eb4ab2ce8f MDEV-35061 XA PREPARE "not supported by the engine" from storage engine mhnsw, memory leak
disallow explicit XA PREPARE over mhnsw indexes
2024-11-05 14:00:51 -08:00
Sergei Golubchik
09cd817f5d MDEV-35060 Assertion failure upon DML on table with vector under lock 2024-11-05 14:00:51 -08:00
Sergei Golubchik
09889d417b MDEV-35055 ASAN errors in TABLE_SHARE::lock_share upon committing transaction after FLUSH on table with vector key
MHNSW_Trx cannot store a pointer to the TABLE_SHARE for the duration
of a transaction, because the share can be evicted from the cache any
time.

Use a pointer to the MDL_ticket instead, it is stored in the THD
and has a duration of MDL_TRANSACTION, so won't go away.

When we need a TABLE_SHARE - on commit - get a new one from tdc.
Normally, it'll be already in the cache, so it'd be fast.
We don't optimize for the edge case when TABLE_SHARE was evicted.
2024-11-05 14:00:51 -08:00
Sergei Golubchik
d396fb9226 MDEV-35021 Behavior for RTREE indexes changed, assertion fails
disallow USING RTREE for not SPATIAL index
2024-11-05 14:00:51 -08:00