mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-15 19:42:28 +01:00

Author	SHA1	Message	Date
Sergei Golubchik	7aa28a2a54	MDEV-35354 InnoDB: Failing assertion: node->pcur->rel_pos == BTR_PCUR_ON upon LOAD DATA REPLACE with unique blob restore erroneously changed line followup for `f2512c0fa8`	2024-11-07 06:16:03 -08:00
Sergei Golubchik	f24ebbaa5c	cleanup: main.loaddata_autocom_innodb	2024-11-07 06:12:54 -08:00
Alexander Barkov	b9f9d804f2	MDEV-28686 Assertion `0' in Type_handler_string_result::make_sort_key or unexpected result The code in the can_eval_in_optimize() branch in Item_func_pad::fix_length_and_dec() did not take into account that the constant can be negative. So the function will return NULL. This later crashed on DBUG_ASSERT() because a NOT NULL function returned NULL. Adding set_maybe_null() into this branch if the constant is negative.	2024-11-06 15:45:59 +04:00
Alexander Barkov	4ded2cbe13	MDEV-31910 ASAN memcpy-param-overlap upon CONCAT in ORACLE mode Fixing Item_func_concat_operator_oracle::val_str() to use String::copy_or_move(), like Item_func_oracle::val_str() does.	2024-11-06 11:39:50 +04:00
Julius Goryavsky	db68eb69f9	MDEV-35344: post-fix correction for other galera tests	2024-11-06 04:59:10 +01:00
Jan Lindström	e4a3a11dcc	MDEV-35344 : Galera test failure on galera_sync_wait_upto Ignoring configured server_id should not be a warning because correct configuration is documented. Changed message to info level with more detailed message what was configured and what will be actually used. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-11-06 04:59:10 +01:00
Jan Lindström	eb891b6a95	MDEV-35345 : Galera test failure on MW-402 Add missing have_debug[_sync].inc include. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-11-06 04:59:09 +01:00
Denis Protivensky	6d5fe9ed0d	MDEV-28378: Don't hang trying to peek log event past the end of log While applying CTAS log event, we peek the relay log to see if CTAS contains inserted rows or if it's empty. The peek function didn't check for end-of-file condition when trying to get the next event from the log, and thus it hung. The fix includes checking for end-of-file while peeking for log events and considering returned XID_EVENT value as a sign of an empty CTAS. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-11-06 04:59:09 +01:00
Sergei Golubchik	ebbbe9d960	MDEV-35319 ER_LOCK_DEADLOCK not detected upon DML on table with vector key, server crashes cannot ignore the error in MHNSW_Share::acquire() - it could be a deadlock signal, after which no further operations are allowed	2024-11-05 14:00:52 -08:00
Sergei Golubchik	574e18f80d	MDEV-35308 NO_KEY_OPTIONS SQL mode has no effect on engine key options hide INVISIBLE and engine field options under sql_mode=no_field_options hide PARSER and engine key options under sql_mode=no_key_options	2024-11-05 14:00:52 -08:00
Sergei Golubchik	e5a5d2b78d	MDEV-35214 Server crashes in FVectorNode::gref_len with insufficient mhnsw_max_cache_size now with streaming (MDEV-35032) we can no longer free MHNSW_Trx at the end of the search. Cannot even free it at the end of the mhnsw_insert, because there can be a search running (INSERT ... SELECT). Let's do reference counting, even though it's a thread-local object.	2024-11-05 14:00:52 -08:00
Sergei Golubchik	cbc2812f80	MDEV-35287 ER_KEY_NOT_FOUND upon INSERT into InnoDB table with vector key under READ COMMITTED InnoDB cannot enable internal bulk insert for hlindex tables	2024-11-05 14:00:52 -08:00
Sergei Golubchik	ad33ffc0b5	MDEV-35296 DESC does not work in ORDER BY with vector key only use vector indexes for ORDER BY ... ASC	2024-11-05 14:00:52 -08:00
Sergei Golubchik	7feec30939	relax the XA recovery error it's just a suggestion anyway, not a bullet-proof check, let's not act as if it is	2024-11-05 14:00:52 -08:00
Sergei Golubchik	b09c8b03d7	MDEV-35244 Vector-related system variables could use better names considering that users don't interact with MariaDB vector search directly, but primarily use AI frameworks, we should use names familiar to vector store connector writers and for AI framework users. That is industry standard M and ef. mhnsw_cache_size -> mhnsw_max_cache_size mhnsw_distance_function -> mhnsw_default_distance mhnsw_max_edges_per_node -> mhnsw_default_m mhnsw_min_limit -> mhnsw_ef_search inside CREATE TABLE: max_edges_per_node -> m distance_function -> distance	2024-11-05 14:00:52 -08:00
Sergei Golubchik	784becf3e1	MDEV-35267 Server crashes in _ma_reset_history upon altering on Aria table with vector key under lock ALTER TABLE needs to open hlindex tables early enough, right after they were created, so that cleanup after an error would see and delete them. But they need to be external_lock-ed only in copy_data_between_tables, after mysql_trans_prepare_alter_copy_data(). Let's move locking out of hlindex_open() into hlindex_lock()	2024-11-05 14:00:52 -08:00
Sergei Golubchik	5d9ebef41e	MDEV-35258 Mariabackup does not work with MyISAM tables with vector keys recognize #i# files in mariadb-backup	2024-11-05 14:00:52 -08:00
Sergei Golubchik	0b9bc6c3cd	MDEV-35246 Vector search skips a row in the table stronger condition in select_neighbors() to reject exact matches too	2024-11-05 14:00:52 -08:00
Sergey Vojtovich	d50663198c	DDL recovery for high-level indexes	2024-11-05 14:00:52 -08:00
Sergey Vojtovich	883fb66cd4	MDEV-35130 Assertion fails in trx_t::check_bulk_buffer upon CREATE.. SELECT with vector key Similarly to "ALTER TABLE fixes for high-level indexes", don't enable bulk insert when issuing create ... insert into a table containing vector index. InnoDB can't handle situation when bulk insert is enabled for one table but disabled for another. We can't do bulk insert on vector index as it does table updates currently.	2024-11-05 14:00:52 -08:00
Sergei Golubchik	f6de9a379a	MDEV-34919 post-fix * add Aria truncate checks * do store_lock() with a correct TL_xxx level * remove InnoDB workaround for missing store_lock (from MDEV-35032) * don't start transaction in temp tables (for Aria, with a test case)	2024-11-05 14:00:52 -08:00
Sergey Vojtovich	1cc7ef52e3	MDEV-34919 Aria crashes with high-level (vector) indexes Since high-level index tables do not participate in thr_multi_lock(), added explicit call to THR_LOCK::start_trans(). This is needed mostly for Aria to handle transaction logging.	2024-11-05 14:00:52 -08:00
Sergei Golubchik	72839c1435	MDEV-35245 SHOW CREATE TABLE produces unusable statement for vector fields with constant default value print default values for binary types as binary strings	2024-11-05 14:00:52 -08:00
Sergei Golubchik	053bd80d43	MDEV-35230 ASAN errors upon reading from joined temptable views with vector type fix Field_vector::get_copy_func() for the case when length_bytes differ fix do_copy_vec() to not guess length_bytes but take it from the field (for keys length_bytes is always 2 for any length)	2024-11-05 14:00:52 -08:00
Sergei Golubchik	7d081c1b83	MDEV-35223 REPAIR does not fix MyISAM table with vector key after crash recovery resort to alter for repair too	2024-11-05 14:00:52 -08:00
Sergei Golubchik	e8cff8e829	MDEV-35219 Unexpected ER_DUP_KEY after OPTIMIZE on MyISAM table with vector key in-engine optimize can break hlindexes. let's fallback to ALTER	2024-11-05 14:00:52 -08:00
Sergei Golubchik	8988decbfe	MDEV-35220 Assertion `!item->null_value' failed upon VEC_TOTEXT call don't forget to reset null_value for each row	2024-11-05 14:00:52 -08:00
Sergei Golubchik	14364b09b9	MDEV-35236 Assertion `(mem_root->flags & 4) == 0' failed in safe_lexcstrdup_root followup for MDEV-35092	2024-11-05 14:00:52 -08:00
Sergei Golubchik	1a53048299	MDEV-35215 ASAN errors in Item_func_vec_fromtext::val_str upon VEC_FROMTEXT with an invalid argument	2024-11-05 14:00:52 -08:00
Sergei Golubchik	96eb66e5b3	MDEV-35205 Server crash in online alter upon concurrent ALTER and DML on table with vector field test case	2024-11-05 14:00:52 -08:00
Sergei Golubchik	e020a3a2ce	MDEV-35210 Vector type cannot store values which VEC_FromText produces and VEC_ToText accepts let VEC_FromText validate that the vector l2squared isn't NaN. VEC_ToText still prints everything.	2024-11-05 14:00:52 -08:00
Sergei Golubchik	f336b10bb1	MDEV-35212 Server crashes in Item_func_vec_fromtext::val_str upon query from empty table	2024-11-05 14:00:52 -08:00
Sergei Golubchik	2bec721316	MDEV-35203 ASAN errors or assertion failures in row_sel_convert_mysql_key_to_innobase upon query from table with usual key on vector field add test	2024-11-05 14:00:52 -08:00
Sergei Golubchik	2e74a00d9d	MDEV-35195 Assertion `tab->join->order' fails upon vector search with DISTINCT #2 MDEV-35337 Server crash or assertion failure in join_read_first upon using vector distance in group by allow Item_func_distance to be not only in tab->join->order, but alternatively in tab->join->group_list	2024-11-05 14:00:52 -08:00
Sergei Golubchik	926b339b93	MDEV-35194 non-BNL join fails on assertion with streaming implemented mhnsw no longer needs to know the LIMIT in advance. let's just cap it to avoid allocating too much memory for the one step result set	2024-11-05 14:00:52 -08:00
Sergei Golubchik	597e34d000	MDEV-35213 Server crash or assertion failure upon query with high value of mhnsw_min_limit mhnsw_min_limit must not be larger than candidates queue size	2024-11-05 14:00:52 -08:00
Sergei Golubchik	dd9a5dd5b5	MDEV-35204 mysqlbinlog --verbose fails on row events with vector type test case	2024-11-05 14:00:52 -08:00
Sergei Golubchik	ed9fec0266	MDEV-35177 Unexpected ER_TRUNCATED_WRONG_VALUE_FOR_FIELD, diagnostics area assertion failures upon EITS collection with vector type	2024-11-05 14:00:52 -08:00
Sergei Golubchik	db10e5cf6c	MDEV-35160 RBR does not work with vector type, ER_SLAVE_CONVERSION_FAILED	2024-11-05 14:00:52 -08:00
Sergei Golubchik	8f49fb8cc3	MDEV-35191 Assertion failure in Create_tmp_table::finalize upon DISTINCT with vector type test only	2024-11-05 14:00:51 -08:00
Sergei Golubchik	cfbf065893	MDEV-35176 ASAN errors in Field_vector::store with optimizer_trace enabled	2024-11-05 14:00:51 -08:00
Sergei Golubchik	425aa95655	MDEV-35178 Assertion failure in Field_vector::store upon INSERT IGNORE with a wrong data	2024-11-05 14:00:51 -08:00
Sergei Golubchik	88adcbf35a	MDEV-35182 crash in online_alter_end_trans with XA over vector indexes ONLINE ALTER didn't expect XA PREPARE to fail. Mark rollback on failed prepare with the XA_ROLLBACK_ONLY state, detect that in ONLINE ALTER	2024-11-05 14:00:51 -08:00
Sergei Golubchik	5bde23990b	MDEV-35159 Assertion `tab->join->select_limit < (~ (ha_rows) 0)' fails upon forcing vector key init_from_binary_frm_image() wrongly assumed that * if a table has primary key * and it has the HA_PRIMARY_KEY_IN_READ_INDEX flag * then ORDER BY any index automatically implies ORDER BY pk at the end, that is for an index (a,b,c) ORDER BY a,b,c means ORDER BY a,b,c,pk which is wrong, it holds not for _any index_ but only for indexes that can be used for ORDER BY. So, don't do `field->part_of_sortkey= share->keys_in_use` but introduce `sort_keys_in_use` and use that.	2024-11-05 14:00:51 -08:00
Sergei Golubchik	88119addff	Vec_ToText was underestimating max_length of the result switch to a more predictable, shorter, and more correct output that is, print as many significant digits as necessary. but not more (they'd be just zeros) and not less (it'd lose precision)	2024-11-05 14:00:51 -08:00
Sergei Golubchik	91720da9be	MDEV-35158 Assertion `res->length() > 0 && res->length() % 4 == 0' fails upon increasing length of vector column	2024-11-05 14:00:51 -08:00
Sergei Golubchik	6634c88480	MDEV-35150 Column containing non-vector tables can be modified to VECTOR type without warnings	2024-11-05 14:00:51 -08:00
Sergei Golubchik	ca28761066	MDEV-35147 Inconsistent NULL handling in vector type	2024-11-05 14:00:51 -08:00
Sergei Golubchik	f274cf1c25	MDEV-35141 Server crashes in Field_vector::report_wrong_value upon statistic collection	2024-11-05 14:00:51 -08:00
Sergei Golubchik	78119d1ae5	MDEV-33410 VECTOR data type	2024-11-05 14:00:51 -08:00
Sergei Golubchik	b56ca29f89	MDEV-35105 Assertion `tab->join->order' fails upon vector search with DISTINCT don't apply distinct optimization to order by a vector index	2024-11-05 14:00:51 -08:00
Sergei Golubchik	9ddb94f60e	MDEV-35104 Invalid (old?) table or database name upon DDL on table with vector key and unique key InnoDB rename needs the same workaround for hlindexes as it has for partitions	2024-11-05 14:00:51 -08:00
Sergei Golubchik	7d9c0e4f62	MDEV-35092 Server crash, hang or ASAN errors in mysql_create_frm_image upon using non-default table options and system variables extend the option_list explicitly on CREATE/ALTER, not implicitly on parsing.	2024-11-05 14:00:51 -08:00
Sergei Golubchik	cdc7253787	make MyISAM and Aria report correct reflength to the server MyISAM and Aria used to lie to the server about the reflength value. One value was used internally, it was stored on disk, e.g. in indexes, and couldn't be changed without full table rebuild. A differently calculated value was reported to the server - that value was sometimes larger than the true reflength. That caused the server to allocate more memory per position than necessary - affecting filesort, join buffer usage, optimizer cost calculations, and maybe more.	2024-11-05 14:00:51 -08:00
Sergei Golubchik	ea1e720391	MDEV-35078 Server crash or ASAN errors in mhnsw_insert when adding a column or index that uses plugin-defined sysvar-based options with ALTER TABLE ... ADD, the server was using the default value of the sysvar, not the current one. CREATE TABLE was correctly using the current sysvar value. Fix it so that new columns/indexes added in ALTER TABLE ... ADD would use a current sysvar value. Existing columns/indexes in ALTER TABLE would keep using the default sysvar value (unless they had an explicit value in frm).	2024-11-05 14:00:51 -08:00
Sergei Golubchik	855aefb7b5	mysqldump and mariadb-backup tests of vector indexes	2024-11-05 14:00:51 -08:00
Sergei Golubchik	eb4ab2ce8f	MDEV-35061 XA PREPARE "not supported by the engine" from storage engine mhnsw, memory leak disallow explicit XA PREPARE over mhnsw indexes	2024-11-05 14:00:51 -08:00
Sergei Golubchik	09cd817f5d	MDEV-35060 Assertion failure upon DML on table with vector under lock	2024-11-05 14:00:51 -08:00
Sergei Golubchik	09889d417b	MDEV-35055 ASAN errors in TABLE_SHARE::lock_share upon committing transaction after FLUSH on table with vector key MHNSW_Trx cannot store a pointer to the TABLE_SHARE for the duration of a transaction, because the share can be evicted from the cache any time. Use a pointer to the MDL_ticket instead, it is stored in the THD and has a duration of MDL_TRANSACTION, so won't go away. When we need a TABLE_SHARE - on commit - get a new one from tdc. Normally, it'll be already in the cache, so it'd be fast. We don't optimize for the edge case when TABLE_SHARE was evicted.	2024-11-05 14:00:51 -08:00
Sergei Golubchik	d396fb9226	MDEV-35021 Behavior for RTREE indexes changed, assertion fails disallow USING RTREE for not SPATIAL index	2024-11-05 14:00:51 -08:00
Sergei Golubchik	b3afd9f640	MDEV-35042 Vector indexes are allowed for MERGE tables, but do not disallow hlindexes in MERGE - because we cannot create the secondary table in the same engine	2024-11-05 14:00:51 -08:00
Sergei Golubchik	0932c3a27e	MDEV-35044 ALTER on a table with vector index attempts to bypass unsupported locking limitation, server crashes in THD::free_tmp_table_share open secondary tables early enough for the cleanup on error to see them and remove their underlying files	2024-11-05 14:00:51 -08:00
Sergei Golubchik	824a63852b	MDEV-35043 Unsuitable error upon an attempt to create MEMORY table with vector key MEMORY engine doesn't support blobs	2024-11-05 14:00:51 -08:00
Sergei Golubchik	9f80e3fbb7	MDEV-35032 streaming mode for mhnsw search support SQL semantics for SELECT ... WHERE ... ORDER BY ... LIMIT * switch from returning k nearest neighbors to returning as many as needed, in k-neighbor chunks, with increasing distance * make search_layer() skip nodes that are closer than a threshold * read_next keeps a search context - list of k found nodes, threshold, ctx, etc. * when the list of found nodes is exhausted, it repeats the search starting from last found nodes and a threshold * search context keeps ctx->refcount incremented, so ctx won't go away * but commit_lock is unlocked between calls, so InnoDB can modify the table * use ctx version to detect that, switch to MHNSW_Trx when it happens bugfix: * use the correct lock in ha_external_lock() for the graph table * InnoDB didn't reset locks on ha_external_lock(F_UNLCK) and previous LOCK_X leaked into the next statement	2024-11-05 14:00:51 -08:00
Sergei Golubchik	be69716287	MDEV-35029 ASAN errors in Lex_ident<Compare_ident_ci>::is_valid_ident upon DDL on table with vector index in ALTER TABLE or CREATE TABLE LIKE, create a copy of key->option_list, because it can be extended later on the thd->mem_root, so it has to be a copy	2024-11-05 14:00:51 -08:00
Sergei Golubchik	a6499049af	MDEV-35033 LeakSanitizer errors in my_malloc / safe_mutex_lazy_init_deadlock_detection / MHNSW_Context::alloc_node and alike ctx wasn't released on errors	2024-11-05 14:00:51 -08:00
Sergei Golubchik	6837bb54f4	MDEV-35020 After a failed attempt to create vector index temporary file remains and prevents further operation	2024-11-05 14:00:50 -08:00
Sergei Golubchik	ec2ff9f2a0	MDEV-35035 Assertion failure in ha_blackhole::position upon INSERT into blackhole table with vector index let's allow ::position() and ::rnd_pos() in blackhole. ::position() can be called directly after insert, it doesn't need a search to happen, so it's possible. ::rnd_pos() can be called with a value that ::position() produced, so, possible too.	2024-11-05 14:00:50 -08:00
Sergei Golubchik	8ac3f0b1d4	MDEV-35038 Server crash in Index_statistics::get_avg_frequency upon EITS collection for vector index don't collect eits for vector indexes	2024-11-05 14:00:50 -08:00
Sergei Golubchik	0bd01f4a95	MDEV-35039 Number of indexes inside InnoDB differs from that defined in MariaDB after altering table with vector key don't show table->s->total_keys to engine in inplace alter	2024-11-05 14:00:50 -08:00
Sergei Golubchik	8253650aaa	MDEV-35006 Using varbinary as vector-storing column results in assertion failures * hlindexes cannot be extended with pk * hlindexes cannot be covering	2024-11-05 14:00:50 -08:00
Sergei Golubchik	fb04cad37e	trying to stabilize floating-point tests	2024-11-05 14:00:50 -08:00
Sergey Vojtovich	f867c2a21e	Disabled high-level indexes with Aria ... until a few bugs that cause server crash are fixed.	2024-11-05 14:00:50 -08:00
Sergey Vojtovich	ca17b68bb6	ALTER TABLE fixes for high-level indexes (iii) quick_rm_table() expects .frm to exist when it removes high-level indexes. For cases like ALTER TABLE t1 RENAME TO t2, ENGINE=other_engine .frm was removed earlier. Another option would be removing high-level indexes explicitly before the first quick_rm_table() and skipping high-level indexes for subsequent quick_rm_table(NO_FRM_RENAME). But this suggested order may also help with ddl log recovery. That is if we crash before high-level indexes are removed, .frm is going to exist.	2024-11-05 14:00:50 -08:00
Sergey Vojtovich	7aa6bb3aa3	ALTER TABLE fixes for high-level indexes (ii) Disable non-copy ALTER algorithms when VECTOR index is affected. Engines are not supposed to handle high-level indexes anyway. Also fixed misbehaving IF [NOT] EXISTS variants.	2024-11-05 14:00:50 -08:00
Sergey Vojtovich	a90fa3f397	ALTER TABLE fixes for high-level indexes (i) Fixes for ALTER TABLE ... ADD/DROP COLUMN, ALGORITHM=COPY. Let quick_rm_table() remove high-level indexes along with original table. Avoid locking uninitialized LOCK_share for INTERNAL_TMP_TABLEs. Don't enable bulk insert when altering a table containing vector index. InnoDB can't handle situation when bulk insert is enabled for one table but disabled for another. We can't do bulk insert on vector index as it does table updates currently.	2024-11-05 14:00:50 -08:00
Sergei Golubchik	2ad9df8c9b	VEC_Distance_Cosine()	2024-11-05 14:00:50 -08:00
Sergei Golubchik	2e1fcc6a80	rename VEC_Distance to VEC_Distance_Euclidean and create a parent Item_func_vec_distance_common class	2024-11-05 14:00:50 -08:00
Sergei Golubchik	0da820cb12	mhnsw: use plugin index options and transaction_participant API	2024-11-05 14:00:50 -08:00
Sergei Golubchik	126d6d787c	cleanup: handlerton remove unused methods, reorder methods, add comments	2024-11-05 14:00:50 -08:00
Sergei Golubchik	8087cefc07	make rename test to work with InnoDB too	2024-11-05 14:00:50 -08:00
Sergei Golubchik	445198c10e	post-fixes for rename	2024-11-05 14:00:50 -08:00
Sergey Vojtovich	97e112fb82	VECTOR indexes support for RENAME TABLE Rename high-level indexes along with a table.	2024-11-05 14:00:49 -08:00
Sergei Golubchik	ebcbed6d74	post-fixes for TRUNCATE * fix the truncate-by-handler variant, used by InnoDB * test that insert works after truncate, meaning graph table was emptied * test that the vector index size is zero after truncate in MyISAM	2024-11-05 14:00:49 -08:00
Sergey Vojtovich	70575defb7	Fixed TRUNCATE TABLE against VECTOR indexes This patch fixes only TRUNCATE by recreate variant, there seems to be no reasonable engine that uses TRUNCATE by handler method for testing. Reset index_cinfo so that mi_create is not confused by garbage passed via index_file_name and sets MY_DELETE_OLD flag. Review question: can we add a test case to make sure VECTOR index is empty indeed?	2024-11-05 14:00:49 -08:00
Sergey Vojtovich	91a24ddc5d	Test insert ... select with vector index	2024-11-05 14:00:49 -08:00
Sergey Vojtovich	4aa1968b89	Disable VECTOR indexes with partitioned tables	2024-11-05 14:00:49 -08:00
Sergey Vojtovich	7c16bba71d	CREATE TABLE ... LIKE loses VECTOR index	2024-11-05 14:00:49 -08:00
Vicențiu Ciorbaru	eec1339f5d	MDEV-32886 Vec_FromText and Vec_ToText This commit introduces two utility functions meant to make working with vectors simpler. Vec_ToText converts a binary vector into a json array of numbers (floats). Vec_FromText takes in a json array of numbers and converts it into a little-endian IEEE float sequence of bytes (4 bytes per float).	2024-11-05 14:00:49 -08:00
Sergei Golubchik	f44989ff0f	UPDATE/DELETE post-fixes	2024-11-05 14:00:49 -08:00
Hugo Wen	0e2b9e7621	MDEV-33408 Initial support for vector DELETE and UPDATE When the source row is deleted, mark the corresponding node in HNSW index by setting `tref` to null. An index is added for the `tref` in secondary table for faster searching of the to-be-marked nodes. The nodes marked as deleted will still be used for search, but will not be included in the final query results. As skipping deleted nodes and not adding deleted nodes for new-inserted nodes' neighbor list could impact the performance, we now only skip these nodes in search results. - for some reason the bitmap is not set for hlindex during the delete so I had to temporarily comment out one line All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2024-11-05 14:00:49 -08:00
Sergei Golubchik	049d839350	mhnsw: inter-statement shared cache * preserve the graph in memory between statements * keep it in a TABLE_SHARE, available for concurrent searches * nodes are generally read-only, walking the graph doesn't change them * distance to target is cached, calculated only once * SIMD-optimized bloom filter detects visited nodes * nodes are stored in an array, not List, to better utilize bloom filter * auto-adjusting heuristic to estimate the number of visited nodes (to configure the bloom filter) * many threads can concurrently walk the graph. MEM_ROOT and Hash_set are protected with a mutex, but walking doesn't need them * up to 8 threads can concurrently load nodes into the cache, nodes are partitioned into 8 mutexes (8 is chosen arbitrarily, might need tuning) * concurrent editing is not supported though * this is fine for MyISAM, TL_WRITE protects the TABLE_SHARE and the graph (note that TL_WRITE_CONCURRENT_INSERT is not allowed, because an INSERT into the main table means multiple UPDATEs in the graph) * InnoDB uses secondary transaction-level caches linked in a list in thd->ha_data via a fake handlerton * on rollback the secondary cache is discarded, on commit nodes from the secondary cache are invalidated in the shared cache while it is exclusively locked * on savepoint rollback both caches are flushed. this can be improved in the future with a row visibility callback * graph size is controlled by @@mhnsw_cache_size, the cache is flushed when it reaches the threshold	2024-11-05 14:00:49 -08:00
Sergei Golubchik	5c2b7c6e7f	mhnsw: configurable parameters 1. introduce alpha. the value of 1.1 is optimal, so hard-code it. 2. hard-code ef_construction=10, best by test 3. rename hnsw_max_connection_per_layer to mhnsw_max_edges_per_node (max_connection is rather ambiguous in MariaDB) and add a help text 4. rename hnsw_ef_search to mhnsw_min_limit and add a help text	2024-11-05 14:00:49 -08:00
Sergei Golubchik	25b4000290	InnoDB support for hlindexes and mhnsw * mhnsw: * use primary key, innodb loves it (and the index cannot have dupes anyway) * MyISAM is ok with that, performance-wise * must be ha_rnd_init(0) because we aren't going to scan * MyISAM resets the position on ha_rnd_init(0) so query it before * oh, and use the correct handler, just in case * HA_ERR_RECORD_IS_THE_SAME is no error * innodb: * return ref_length on create * don't assume table->pos_in_table_list is set * ok, assume away, but only for system versioned tables * set alter_info on create (InnoDB needs to check for FKs) * pair external_lock/external_unlock correctly	2024-11-05 14:00:49 -08:00
Sergei Golubchik	3ff7f04fd4	misc changes * sysvars should be REQUIRED_ARG * fix a mix of US and UK spelling (use US) * use consistent naming * work if VEC_DISTANCE arguments are in the swapped order (const, col) * work if VEC_DISTANCE argument is NULL/invalid or wrong length * abort INSERT if the value is invalid or wrong length * store the "number of neighbors" in a blob in endianness-independent way * use field->store(longlong, bool) not field->store(double) * a lot more error checking everywhere * cleanup after errors * simplify calling conventions, remove reinterpret_cast's * todo/XXX comments * whitespaces * use float consistently memory management is still totally PoC quality	2024-11-05 14:00:48 -08:00
Vicențiu Ciorbaru	88839e71a3	Initial HNSW implementation This commit includes the work done in collaboration with Hugo Wen from Amazon: MDEV-33408 Alter HNSW graph storage and fix memory leak This commit changes the way HNSW graph information is stored in the second table. Instead of storing connections as separate records, it now stores neighbors for each node, leading to significant performance improvements and storage savings. Comparing with the previous approach, the insert speed is 5 times faster, search speed improves by 23%, and storage usage is reduced by 73%, based on ann-benchmark tests with random-xs-20-euclidean and random-s-100-euclidean datasets. Additionally, in previous code, vector objects were not released after use, resulting in excessive memory consumption (over 20GB for building the index with 90,000 records), preventing tests with large datasets. Now ensure that vectors are released appropriately during the insert and search functions. Note there are still some vectors that need to be cleaned up after search query completion. Needs to be addressed in a future commit. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc. As well as the commit: Introduce session variables to manage HNSW index parameters Three variables: hnsw_max_connection_per_layer hnsw_ef_constructor hnsw_ef_search ann-benchmark tool is also updated to support these variables in commit https://github.com/HugoWenTD/ann-benchmarks/commit/e09784e for branch https://github.com/HugoWenTD/ann-benchmarks/tree/mariadb-configurable All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc. Co-authored-by: Hugo Wen <wenhug@amazon.com>	2024-11-05 14:00:48 -08:00
Sergei Golubchik	d6add9a03d	initial support for vector indexes MDEV-33407 Parser support for vector indexes The syntax is create table t1 (... vector index (v) ...); limitation: * v is a binary string and NOT NULL * only one vector index per table * temporary tables are not supported MDEV-33404 Engine-independent indexes: subtable method added support for so-called "high level indexes", they are not visible to the storage engine, implemented on the sql level. For every such an index in a table, say, t1, the server implicitly creates a second table named, like, t1#i#05 (where "05" is the index number in t1). This table has a fixed structure, no frm, not accessible directly, doesn't go into the table cache, needs no MDLs. MDEV-33406 basic optimizer support for k-NN searches for a query like SELECT ... ORDER BY func() optimizer will use item_func->part_of_sortkey() to decide what keys can be used to resolve ORDER BY.	2024-11-05 14:00:48 -08:00
Sergei Golubchik	aa09cb3b11	open frm for DROP TABLE needed to get partitioning and information about secondary objects	2024-11-05 14:00:48 -08:00
Sergei Golubchik	1fe8a1bb76	cleanup: generalize ER_INNODB_NO_FT_TEMP_TABLE	2024-11-05 14:00:48 -08:00
Sergei Golubchik	fd69abe44f	cleanup: generalize ER_SPATIAL_CANT_HAVE_NULL	2024-11-05 14:00:48 -08:00
Sergei Golubchik	062f8eb37d	cleanup: key algorithm vs key flags the information about index algorithm was stored in two places inconsistently split between both. BTREE index could have key->algorithm == HA_KEY_ALG_BTREE, if the user explicitly specified USING BTREE or HA_KEY_ALG_UNDEF, if not. RTREE index had key->algorithm == HA_KEY_ALG_RTREE and always had key->flags & HA_SPATIAL FULLTEXT index had key->algorithm == HA_KEY_ALG_FULLTEXT and always had key->flags & HA_FULLTEXT HASH index had key->algorithm == HA_KEY_ALG_HASH or HA_KEY_ALG_UNDEF long unique index always had key->algorithm == HA_KEY_ALG_LONG_HASH In this commit: All indexes except BTREE and HASH always have key->algorithm set, HA_SPATIAL and HA_FULLTEXT flags are not used anymore (except for storage to keep frms backward compatible). As a side effect ALTER TABLE now detects FULLTEXT index renames correctly	2024-11-05 14:00:47 -08:00
Sergei Golubchik	44ff2f7831	reject invalid spatial key declarations in the parser	2024-11-05 14:00:47 -08:00
Sergei Golubchik	9fa31c1bd9	cleanup: spaces, casts, comments	2024-11-05 14:00:47 -08:00
Sergei Golubchik	4f4c5a2ba9	fix a typo and an old bug in perfschema.transaction test	2024-11-05 14:00:47 -08:00
Sergei Golubchik	70f000f1dc	fix main.plugin_vars test to cleanup after itself	2024-11-05 14:00:46 -08:00
Sergei Golubchik	9ddac64188	make INFORMATION_SCHEMA.STATISTICS.COMMENT not nullable as it can never be null (only "" or "disabled")	2024-11-05 14:00:46 -08:00
Sergei Golubchik	680bdb76a6	fix for 32bit	2024-11-05 14:00:46 -08:00
Vladislav Vaintroub	faf9e755ba	MDEV-35109 fix test case rpl_semi_sync_after_sync_coord_consistency fails on release compilation	2024-11-05 22:38:55 +01:00
Vladislav Vaintroub	37b7986467	Merge branch '10.5' into 10.6	2024-11-05 21:02:22 +01:00
Alexander Barkov	7741065936	MDEV-23895 Server crash, ASAN heap-buffer-overflow or Valgrind Invalid write in Item_func_rpad::val_str Item_cache_int::val_str() and Item_cache_real::val_str() erroneously used default_charset(). Fixing to return my_charset_numeric instead.	2024-11-05 12:36:08 +04:00
Oleg Smirnov	a914087fab	MDEV-35307 Unexpected error WARN_SORTING_ON_TRUNCATED_LENGTH or assertion failure in diagnostics area #2 When strict mode is enabled, all warnings during `INSERT` are converted to errors regardless of their actual severity. `WARN_SORTING_ON_TRUNCATED_LENGTH` is not considered severe enough to be elevated to the ERROR level, and this commit fixes that	2024-11-05 14:52:20 +07:00
Alexander Barkov	eb41c1171e	MDEV-33942 View cuts off the end of string with the utf8 character set in INSERT function Item_func_insert::fix_length_and_dec() incorrectly calculated max_length when its collation.collation evaluated to my_charset_bin. Fixing the code to calculate max_length in terms of octets rather than in terms of characters when collation.collation is my_charset_bin.	2024-11-05 11:16:10 +04:00
Alexander Barkov	c2bf1d4781	MDEV-29552 LEFT and RIGHT with big value for parameter 'len' >0 return empty value in view The code in max_length_for_string() erroneously returned 0 for huge numbers like 4294967295. Rewriting the code in a more straightforward way.	2024-11-05 09:19:05 +04:00
Brandon Nesterenko	b07258a0d5	MDEV-35109: Semi-sync Replication stalling Primary using wait point=AFTER_SYNC For a primary configured with wait_point=AFTER_SYNC, if two threads T1 (binlogging through MYSQL_BIN_LOG::write()) and T2 were binlogging at the same time, T1 could accidentally wait for its semi-sync ACK using the binlog coordinates of T2. Prior to MDEV-33551, this only resulted in delayed transactions, because all transactions shared the same condition variable for ACK signaling. However, with the MDEV-33551 changes, each thread has its own condition variable to signal. So T1 could wait indefinitely when either: 1) T1's ACK is received but not T2's when T1 goes into wait_after_sync(), because the ACK receiver thread has already notified about the T1 ACK, but T1 was _actually_ waiting on T2's ACK, and therefore tries to wait (in vain). 2) T1 goes to wait_after_sync() before any ACKs have arrived. When T1's ACK comes in, T1 is woken up; however, sees it needs to wait more (because it was actually waiting on T2's ACK), and goes to wait again (this time, in vain). Note that the actual cause of T1 waiting on T2's binlog coordinates is when MYSQL_BIN_LOG::write() would call Repl_semisync_master::wait_after_sync(), the binlog offset parameter was read as the end of MYSQL_BIN_LOG::log_file, which is shared among transactions. So if T2 had updated the binary log _after_ T1 had released LOCK_log, but not yet invoked wait_after_sync(), it would use the end of the binary log file as the binlog offset, which was that of T2 (or any future transaction). The fix in this patch ensures consistency between the binary log coordinates a transaction uses between report_binlog_update() and wait_after_sync(). Reviewed By ============ Kristian Nielsen <knielsen@knielsen-hq.org> Andrei Elkin <andrei.elkin@mariadb.com>	2024-11-04 10:45:58 -07:00
Brandon Nesterenko	5290fa043b	MDEV-35109 PREP: simulate_delay_semisync_slave_reply use debug_sync This is a preparatory commit for MDEV-35109 to make its testing code cleaner (and harden other tests too). The DEBUG_DBUG point simulate_delay_semisync_slave_reply up to this patch used my_sleep() to delay an ACK response, but sleeps are prone to test failures on machines that run tests when already having a heavy load (e.g. on buildbot). This patch changes this DEBUG_DBUG sleep to use DEBUG_SYNC to coordinate exactly when a slave should send its reply, which is safer and faster. As DEBUG_SYNC can't be used while a server is shutting down, to synchronize threads with SHUTDOWN WAIT FOR SLAVES logic, we use and extend wait_for_pattern_in_file.inc to wait for an informational error message in the logic to indicate that the shutdown process has reached the intended state (i.e. indicating that the shutdown has been delayed to await semi-sync ACKs). Specifically, the extensions are as follows: 1. wait_for_pattern_in_file.inc is extended with parameter wait_for_pattern_count as a number that indicates the number of times a pattern should occur in the file before returning control to the calling script. 2. search_for_pattern_in_file.inc is extended with parameter SEARCH_ABORT_IS_SUCCESS to invert the error/success logic, so the SEARCH_ABORT condition can be used to indicate success, rather than error.	2024-11-04 10:45:58 -07:00
Oleksandr Byelkin	a37f71bd10	Merge branch '10.11' into mariadb-10.11.10	2024-11-04 07:42:26 +01:00
Oleksandr Byelkin	f2bb2ab58c	Merge branch '10.6' into mariadb-10.6.20	2024-11-04 07:40:45 +01:00
Oleksandr Byelkin	ecdccddaae	Merge branch '10.5' into mariadb-10.5.27	2024-11-04 07:35:28 +01:00
Alexander Barkov	e60fd6c204	MDEV-28767 Collation "binary" is not accepted for databases, tables, columns MariaDB in a COLLATE clause supported 'binary' only as an identifier: COLLATE `binary` Fixing the parser to understand 'binary' as a keyword: COLLATE binary This is for MySQL compatibility.	2024-11-02 12:47:28 +04:00
Alexander Barkov	5d4a4d2091	Fixing main.type_timestamp failure with --view The patch for MDEV-35250 Assertion `dec <= 6' failed in my_timestamp_binary_length added a test which depends on MDEV-29534 In view FROM_UNIXTIME adds .000000 in the result Adding --disable_view_protocol around the affected statements.	2024-11-02 12:46:27 +04:00
Sergei Golubchik	ac7fe8b214	fix main.selectivity_notembedded --view	2024-11-01 21:01:31 +01:00
Sergei Golubchik	0a3452cf83	MDEV-35229 fix the test for --view also, enable it for --ps	2024-11-01 20:52:58 +01:00
Alexander Barkov	d661bc1552	MDEV-20944 Wrong result of LEAST() and ASAN heap-use-after-free in my_strnncollsp_simple / Item::temporal_precision on TIME() The code tried to avoid String::copy() but did it in a wrong way, so asan detected heap-use-after-free errors. Removing the wrong optimization, using copy() instead.	2024-11-01 15:55:09 +04:00
Alexander Barkov	dd41be2a51	MDEV-29184 Assertion `0' in Item_row::illegal_method_call, Type_handler_row::Item_update_null_value, Item::update_null_value - Moving the check_cols(1) test from fix_fields() to fix_length_and_dec(). So the test is now done before the code calling val_decimal() in fix_length_and_dec(). - Removing Item_func_interval::fix_fields(), as it became equal to the inherited one.	2024-11-01 12:40:43 +04:00
Sergei Golubchik	947de4b1db	print more digits for floating point options in mariadbd --help	2024-11-01 08:58:43 +01:00
Monty	40810baffe	MDEV-33144 Implement the Percona variable slow_query_log_always_write_time This task is inspired by the Percona implementation of slow_query_log_always_write_time. This task implements the variable log_slow_always_query_time (name matching other MariaDB variables using the slow query log). The default value for the variable is 31536000, which makes MariaDB compatible with older installations. For queries with execution time longer than log_slow_always_query_time the variables log_slow_rate_limit and log_slow_min_examined_row_limit will be ignored and the query will be written to the slow query log if there are no other limitations (like log_slow_filter etc). Other things: - long_query_time internal variable renamed to log_slow_query_time. - More descriptive information for "log_slow_query_time".	2024-11-01 08:58:37 +01:00
Brandon Nesterenko	e9a502df08	Testing fix for rpl_semi_sync_cond_var_per_thd failure	2024-10-30 08:32:19 -06:00
Oleksandr Byelkin	c770bce898	Merge branch '11.2' into 11.4	2024-10-30 15:11:17 +01:00
Oleg Smirnov	bf9662f6fa	MDEV-35275 Unexpected WARN_SORTING_ON_TRUNCATED_LENGTH or assertion failure in diagnostics area MDEV-27277 added warnings on truncation during sorting for SELECTs but did not for DML operations. However, UPDATEs and DELETEs may also perform sorting and thus produce warnings. This commit fixes that	2024-10-30 18:47:11 +07:00
Alexander Barkov	556a40dce0	MDEV-35229 NOCOPY has become reserved word bringing wide incompatibility This patch was suggested by Sergei Golubchik. It reverts the second patch from the PR: commit `fa5eeb4931` Fixed ALTER TABLE NOCOPY keyword failure and adds NOCOPY_SYM into keyword_func_sp_var_and_label. The price is one extra shift/reduce conflict in yy_oracle.yy. This should be tolerable.	2024-10-30 13:58:20 +04:00
Oleg Smirnov	c3a7a3c7a2	MDEV-34665 Simplify IN predicate processing for NULL-aware materialization involving only one column It was found that unnecessary work of building Ordered_key structures is being done when processing NULL-aware materialization for IN predicates having only one column. In fact, the logic for that simplified case can be expressed as follows. Say we have predicate left_expr IN (SELECT <subq1>), where left_expr is scalar(not a tuple). Then if (left_expr is NULL) { if (subq1 produced any rows) { // note that we don't care if subq1 has produced // NULLs or not. NULL IN (<some values>) -> UNKNOWN, i.e. NULL. } else { NULL IN ({empty-set}) -> FALSE. } } else { // left_expr is a non-NULL value if (subq1 output has a match for left_expr) { left_expr IN (..., left_expr ...) -> TRUE } else { // no "known" matches. if (subq1 output has a NULL) { left_expr IN ( ... NULL ...) -> (NULL could have been a match or not) -> NULL. } else { // subq1 didn't produce any "UNKNOWNs" so // we're positive there weren't any matches -> FALSE. } } } This commit introduces subselect_single_column_partial_engine class implementing the logic described. Reviewer: Sergei Petrunia <sergey@mariadb.com>	2024-10-30 16:48:36 +07:00
Alexander Barkov	4c7cfd2624	MDEV-34817 perfschema.lowercase_fs_off fails on buildbot I pushed this patch by mistake to 11.7 instead of 11.6. Backporting from 11.7. This is a workaround patch to make buildbot green. Renaming databases from db1/DB2 to m33020_db1/m33020_DB1 to make them unique. So the garbage left by other tests does not show up any more. The real problem will be fixed under terms of: MDEV-35282 Performance schema does not clear package routines	2024-10-30 13:47:19 +04:00
Alexander Barkov	8c0a260a5b	MDEV-35250 Assertion `dec <= 6' failed in my_timestamp_binary_length The TIMESTAMP related code did not handle AUTO_SEC_PART_DIGITS. FROM_UNIXTIME() sets its member 'decimals' to AUTO_SEC_PART_DIGITS. So some scripts involving FROM_UNIXTIME() crashed on assert in debug builds and returned unexpected results in release builds.	2024-10-30 11:09:21 +04:00
Alexander Barkov	a79f314f1b	MDEV-34817 perfschema.lowercase_fs_off fails on buildbot This is a workaround patch to make buildbot green. Renaming databases from db1/DB2 to m33020_db1/m33020_DB1 to make them unique. So the garbage left by other tests does not show up any more. The real problem will be fixed under terms of: MDEV-35282 Performance schema does not clear package routines	2024-10-30 10:21:29 +04:00
Oleksandr Byelkin	69d033d165	Merge branch '10.11' into 11.2	2024-10-29 16:42:46 +01:00
Aleksey Midenkov	cc183489da	MDEV-27293 Allow converting a versioned table from implicit to explicit row_start/row_end columns In case of adding both system fields of same type (length, unsigned flag) as old implicit system fields do the rename of implicit system fields to the ones specified in ALTER, remove SYSTEM_INVISIBLE flag in that case. Correct PERIOD clause must be specified in ALTER as well. MDEV-34904 Inplace alter for implicit to explicit versioning is broken Whether ALTER goes inplace and how it goes inplace depends on handler_flags which goes from alter_info->flags by this logic: ha_alter_info->handler_flags|= (alter_info->flags & ~flags_to_remove); ALTER_VERS_EXPLICIT was not in flags_to_remove and its value (1ULL << 35) clashed with ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX. ALTER_VERS_EXPLICIT must not affect inplace, it is SQL-only so we remove it from handler_flags.	2024-10-29 17:46:40 +03:00
Oleksandr Byelkin	3d0fb15028	Merge branch '10.6' into 10.11	2024-10-29 15:24:38 +01:00
Sergei Golubchik	5e5c3c7cb6	post-merge changes * remove duplicate test file * move all uuidv7 tests into plugin/type_uuid/mysql-test/type_uuid/ * remove mysys/ changes * auto my_random_bytes() fallback - removes duplicate code from uuid, and fixes all other users of my_random_bytes() that don't check the return value (because, perhaps, they don't need crypto-strong random bytes) * End of 11.6 -> 11.7 in tests * clarify the warning text * UUID_VERSION_MASK()/UUID_VARIANT_MASK() must not depend on the version * allow 4x more monotonic uuidv7 per millisecond - instead of stretching 1000 microseconds over 12 bits, let's use extra 2 bits as a counter * rename for compatibility with Percona Server (uuid_v4, uuid_v7)	2024-10-29 14:47:32 +01:00
StefanoPetrilli	2fe269fdcb	MDEV-32637 Implement native UUID7 function	2024-10-29 14:47:32 +01:00
Oleksandr Byelkin	f00711bba2	Merge branch '10.5' into 10.6	2024-10-29 14:20:03 +01:00
Oleksandr Byelkin	6aa47fae30	MDEV-35276 Assertion failure in find_producing_item upon a query from a view Two problems solved: 1) Item_default_value makes a shallow copy so the copy should not delete a field belonging to the Item 2) Item_default_value should not inherit derived_field_transformer_for_having and derived_field_transformer_for_where (in this variant pushing DEFAULT(f) is prohibited (return NULL) but if return "this" it will be allowed (should go with a lot of tests))	2024-10-29 11:44:43 +01:00
Thirunarayanan Balathandayuthapani	db3be9b434	MDEV-35237 Bulk insert fails to apply buffered operation during CREATE..SELECT statement Problem: ======= - InnoDB fails to write the buffered insert operation during create..select operation. This happens when bulk_insert in transaction is reset to false while unlocking a source table. Fix: === - InnoDB should apply the previous buffered changes to all tables if we encounter any statement other than pure INSERT or INSERT..SELECT statement in ha_innobase::external_lock() and start_stmt(). - Remove the function bulk_insert_apply_for_table() start_stmt(), external_lock(): Assert that trx->duplicates should be enabled during bulk insert operation	2024-10-29 15:03:23 +05:30
Alexander Barkov	a88c71b294	MDEV-35041 Simple comparison causes "Illegal mix of collations" even with default server settings The task "MDEV-25829 Change default Unicode collation to uca1400_ai_ci" previously changed collation derivation for string user variables from DERIVATION_EXPLICIT to DERIVATION_COERCIBLE, to resolve illegal collation mix conflicts between table columns and user variables when they have different collations. However, DERIVATION_COERCIBLE was a wrong choice because it caused conflicts between string literals and user variables when they have different collations. Adding a new collation derivation level DERIVATION_USERVAR. This makes the collation of a user variable: - weaker than a table column (like it was intended by MDEV-25829) - but stronger than a literal (like it was in pre-MDEV-25829) Cleanup in sql_type.h: Removing the line "- BINARY(expr)" from the before-DERIVATION_CAST comment, as it was on a wrong place. It's also listed on the correct place before DERIVATION_IMPLICIT.	2024-10-28 16:30:49 +04:00
Monty	066f920484	MDEV-35110 Deadlock on Replica during BACKUP STAGE BLOCK_COMMIT on XA transactions This is an extension of MDEV-30423 "Deadlock on Replica during BACKUP STAGE BLOCK_COMMIT on XA transactions" The original commit in MDEV-30423 was not complete as some usage in XA of MDL_BACKUP_COMMIT locks did not set thd->backup_commit_lock. This is required to be set when using parallel replication. Fixed by ensuring that all usage of BACKUP_COMMIT lock in XA is uniform and all set thd->backup_commit_lock. I also changed all locks to be MDL_EXPLICIT to keep also that part uniform. A regression test is added.	2024-10-28 13:29:21 +02:00
Vladislav Vaintroub	4b068b7fcb	MDEV-32387 - prevent output during temporary changes to STDOUT/STDERR create_process() temporarily changes STDOUT/STDERR output to error log file This might redirect mtr output on Windows, so avoid it by holding flush_lock.	2024-10-28 10:45:50 +01:00
Marko Mäkelä	63a7e4c96b	MDEV-35257 Backup fails during an ALTER TABLE with FULLTEXT INDEX In commit `1c55b845e0` (MDEV-32932) the test mariabackup.innodb_ddl_on_intermediate_table was introduced but disabled. xb_load_single_table_tablespace(): Properly handle missing FTS_ tables. backup_file_op_fail(): Properly handle FILE_DELETE records.	2024-10-28 07:44:18 +02:00
Sergei Petrunia	284593413f	MDEV-35253: xa_prepare_unlock_unmodified fails: shift exponent 32 is too large The code in best_access_path() uses PREV_BITS(uint, N) to compute a bitmap of all keyparts: {keypart0, ... keypart{N-1}}. The problem is that PREV_BITS($type, N) macro code can't handle the case when N = <number of bits in $type>. Also, why use PREV_BITS(uint, ...) for key part map computations when we could have used PREV_BITS(key_part_map) ? Fixed both: - Change PREV_BITS(type, N) to handle any N in [0; n_bits(type)]. - Change PREV_BITS() to use key_part_map when computing key_part_map bitmaps.	2024-10-25 18:02:14 +03:00
Yuchen Pei	b8c2bd9f69	MDEV-35249 Fix regression caused by MDEV-34447 MDEV-34447 Removed setting first_cond_optimization to 0 in update and delete when leaf_tables_saved. This can cause problems when two ps executions of an update go through different paths, where the first ps execution goes through single table update only and the second ps execution also goes through multi table update. When this happens, the first_cond_optimization of the outer query is not set to false during the first ps execution because optimize() is not called for the outer query. But then the second ps execution will call optimize() on the outer query, which with first_cond_optimization==true trips the 2nd ps mem leak detection. This is not a problem in higher version as both executions go through multi table updates, possibly due to MDEV-28883. We fix this problem by restoring the FALSE assignments to first_cond_optimization.	2024-10-25 18:03:40 +11:00
Sergei Golubchik	3cd706b107	MDEV-35236 Assertion `(mem_root->flags & 4) == 0' failed in safe_lexcstrdup_root Post-fix for MDEV-35144. Cannot allocate options values on the statement arena, because HA_CREATE_INFO is shallow-copied for every execution, so if the option_list was initially empty, it will be reset for every execution and any values allocated on the statement arena will be lost. Cannot allocate option values on the execution arena, because HA_CREATE_INFO is shallow-copied for every execution, so if the option_list was initially NOT empty, any values appended to the end will be preserved and if they're on the execution arena their content will be destroyed. Let's use thd->change_item_tree() to save and restore necessary pointers for every execution. followup for `3da565c41d`	2024-10-23 14:58:57 +02:00
Sergei Golubchik	eac33a23da	MDEV-32022 ERROR 1054 (42S22): Unknown column 'X' in 'NEW' in trigger add missing do_get_copy/do_build_clone	2024-10-23 14:58:57 +02:00
Yuchen Pei	4b6922a315	MDEV-25008: UPDATE/DELETE: Cost-based choice IN->EXISTS vs Materialization Single-table UPDATE/DELETE didn't provide outer_lookup_keys value for subqueries. This didn't allow to make a meaningful choice between IN->EXISTS and Materialization strategies for subqueries. Fix this: * Make UPDATE/DELETE save Sql_cmd_dml::scanned_rows, * Then, subquery's JOIN::choose_subquery_plan() can fetch it from there for outer_lookup_keys Details: UPDATE/DELETE now calls select_lex->optimize_unflattened_subqueries() twice, like SELECT does (first call optimize_constant_subqueries() in JOIN::optimize_inner(), then call optimize_unflattened_subqueries() in JOIN::optimize_stage2()): 1. Call with const_only=true before any optimizations. This allows range optimizer and others to use the values of cheap const subqueries. 2. Call it with const_only=false after range optimizer, partition pruning, etc. outer_lookup_keys value is provided, so it's possible to pick a good subquery strategy. Note: PROTECT_STATEMENT_MEMROOT requires that first SP execution performs subquery optimization for all subqueries, even for degenerate query plans like "Impossible WHERE". Due to that, we ensure that the call to optimize_unflattened_subqueries (with const_only=false) even for degenerate query plans still happens, as was the case before this change.	2024-10-23 23:51:24 +11:00
Oleg Smirnov	6bd1cb0ea0	MDEV-34880 Incorrect result for query with derived table having TEXT field When a derived table which has distinct values and BLOB fields is materialized, an index is created over all columns to ensure only unique values are placed to the result. This index is created in a special mode HA_UNIQUE_HASH to support BLOBs. Later the optimizer may incorrectly choose this index to retrieve values from the derived table, although such type of index cannot be used for data retrieval. This commit excludes HA_UNIQUE_HASH indexes from adding to `JOIN::keyuse` array thus preventing their subsequent usage for data retrieval	2024-10-23 17:55:00 +07:00
Vlad Lesin	8c7786e7d5	MDEV-34690 lock_rec_unlock_unmodified() causes deadlock lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock() or under a combination of lock_sys.rd_lock() + record locks hash table cell latch. It also requests page latch to check if locked records were changed by the current transaction or not. Usually InnoDB requests page latch to find a certain record on the page, and then requests lock_sys and/or record lock hash cell latch to request record lock. lock_rec_unlock_unmodified() requests the latches in the opposite order, which causes deadlocks. One possible scenario for the deadlock is the following: thread 1 - lock_rec_unlock_unmodified() is invoked under locks hash table cell latch, the latch is acquired; thread 2 - purge thread acquires page latch and tries to remove delete-marked record, it invokes lock_update_delete(), which requests locks hash table cell latch, held by thread 1; thread 1 - requests page latch, held by thread 2. To fix it we need to release lock_sys.latch and/or lock hash cell latch, acquire page latch and re-acquire lock_sys related latches. When lock_sys.latch and/or lock hash cell latch are released in lock_release_on_prepare() and lock_release_on_prepare_try(), the page on which the current lock is held, can be merged. In this case the bitmap of the current lock must be cleared, and the new lock must be added to the end of trx->lock.trx_locks list, or bitmap of already existing lock must be changed. The new field trx_lock_t::set_nth_bit_calls indicates if new locks (bits in existing lock bitmaps or new lock objects) were created during the period when lock_sys was released in trx->lock.trx_locks list iteration loop in lock_release_on_prepare() or lock_release_on_prepare_try(). And, if so, we traverse the list again. The block can be freed during page merging, which causes an assertion failure in buf_page_get_gen(), as btr_block_get() passes BUF_GET as page get mode to it. 
That's why page_get_mode parameter was added to btr_block_get() to pass BUF_GET_POSSIBLY_FREED from lock_release_on_prepare() and lock_release_on_prepare_try() to buf_page_get_gen(). As searching for id of trx, which modified secondary index record, is quite an expensive operation, restrict its usage to master. A system variable was added to remove the restriction, to simplify testing. The variable exists only either for debug build or for build with -DINNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY option to increase the probability of catching bugs for release build with RQG. Note that the code, which does primary index lookup to find out what transaction modified secondary index record, is necessary only when there is no primary key and no unique secondary key on replica with row based replication, because only in this case extra X locks on unmodified records can be set during scan phase. Reviewed by Marko Mäkelä.	2024-10-23 12:36:17 +03:00
Vlad Lesin	92180ad513	MDEV-34466 XA prepare don't release unmodified records for some cases There is no need to exclude exclusive non-gap locks from the procedure of locks releasing on XA PREPARE execution in lock_release_on_prepare_try() after commit `17e59ed3aa` (MDEV-33454), because lock_rec_unlock_unmodified() should check if the record was modified with the XA, and release the lock if it was not. lock_release_on_prepare_try(): don't skip X-locks, let lock_rec_unlock_unmodified() process them. lock_sec_rec_some_has_impl(): add template parameter for not acquiring trx_t::mutex for the case if a caller already holds the mutex, don't crash if lock's bitmap is clean. row_vers_impl_x_locked(), row_vers_impl_x_locked_low(): add new argument to skip trx_t::mutex acquiring. rw_trx_hash_t::validate_element(): don't acquire trx_t::mutex if the current thread already holds it. Thanks to Andrei Elkin for finding the bug. Reviewed by Marko Mäkelä, Debarun Banerjee.	2024-10-23 12:36:17 +03:00
Marko Mäkelä	1cad1dbde6	MDEV-35235 innodb_snapshot_isolation=ON fails to signal transaction rollback convert_error_code_to_mysql(): Treat DB_DEADLOCK and DB_RECORD_CHANGED in the same way, that is, signal to the SQL layer that the transaction had been rolled back.	2024-10-23 07:55:22 +03:00
Jan Lindström	b3be3c2157	MDEV-30653 : With wsrep_mode=REPLICATE_ARIA only part of mixed-engine transactions is replicated Replication of non-transactional engines is experimental and uses TOI. This naturally means that if there is an open transaction with a transactional engine its changes will be rolled back. Fixed by adding error message if non-transactional engine is part of multi-engine transaction with warning. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-10-23 04:00:52 +02:00
Jan Lindström	7ffa7b6b01	MDEV-31888 : galera.galera_wan, galera.galera_vote_rejoin_* fail Clean up configuration and tests. Add wait conditions to make sure test continues from clean state. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-10-23 03:47:08 +02:00
Oleg Smirnov	fd87e01f38	MDEV-27277 Add a warning when max_sort_length is reached During a query execution some sorting and grouping operations on strings may be involved. System variable max_sort_length defines the maximum number of bytes to use when comparing strings during sorting/grouping. Thus, the comparable parts of strings may be less than their actual size, so the results of the query may be not sorted/grouped properly. To indicate that some comparisons were done on truncated lengths, a new warning has been introduced with this commit.	2024-10-22 22:39:36 +07:00
Alexander Barkov	0d17c540a5	MDEV-27277 Add a warning when max_sort_length is reached Step#1: fixing the return type of strnxfrm() from size_t to this structure: typedef struct { size_t m_output_length; size_t m_source_length_used; uint m_warnings; } my_strnxfrm_ret_t;	2024-10-22 21:42:53 +07:00
Oleksandr Byelkin	9b3413c71f	MDEV-8578: fix galera test	2024-10-22 09:23:56 +02:00
Oleksandr Byelkin	d29611afa1	MDEV-15497 fixed outdated syntax	2024-10-22 09:12:23 +02:00
Brandon Nesterenko	1ed30e08af	MDEV-34122: Assertion `entry' failed in Active_tranx::assert_thd_is_waiter If semi-sync is switched off then on while a transaction is in-between binlogging and waiting for an ACK, the semi-sync state of the transaction is removed, leading to a debug assertion that indicates the transaction tried to wait, but cannot receive an ACK signal. More specifically, when semi-sync is switched off, the Active_tranx list is cleared (where a transaction adds an entry to this list during binlogging), and each entry in this list saves the thread which will wait for an ACK, and the thread has the COND variable to signal to wake itself. So if the entry is lost, the Ack_receiver thread won’t be able to find the thread to wake up when an ACK comes in. The fix is to ensure that the entry exists before awaiting the ACK, and if there is no entry, skip the wait. In debug builds, an informative message is written explaining that the transaction is skipping its wait. Additional debug-build only logic is added to ensure that the cause of the missing entry is due to semi-sync being turned off and on. Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org>	2024-10-21 15:35:54 -06:00
Alexander Barkov	855c21eb99	Recording ctype_gbk_export_import.result according to MDEV-34883	2024-10-21 14:08:00 +04:00
Thirunarayanan Balathandayuthapani	7f7d78bc18	MDEV-35183 ADD FULLTEXT INDEX unnecessarily DROPS FTS COMMON TABLES - InnoDB fulltext rebuilds the FTS COMMON table while adding the new fulltext index. This can be optimized by avoiding rebuilding the FTS COMMON table if the FTS COMMON TABLE already exists. Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>	2024-10-21 12:27:09 +05:30
Alexander Barkov	e1cd3c4033	MDEV-12252 ROW data type for stored function return values Adding support for the ROW data type in the stored function RETURNS clause: - explicit ROW(..members...) for both sql_mode=DEFAULT and sql_mode=ORACLE CREATE FUNCTION f1() RETURNS ROW(a INT, b VARCHAR(32)) ... - anchored "ROW TYPE OF [db1.]table1" declarations for sql_mode=DEFAULT CREATE FUNCTION f1() RETURNS ROW TYPE OF test.t1 ... - anchored "[db1.]table1%ROWTYPE" declarations for sql_mode=ORACLE CREATE FUNCTION f1() RETURN test.t1%ROWTYPE ... Adding support for anchored scalar data types in RETURNS clause: - "TYPE OF [db1.]table1.column1" for sql_mode=DEFAULT CREATE FUNCTION f1() RETURNS TYPE OF test.t1.column1; - "[db1.]table1.column1" for sql_mode=ORACLE CREATE FUNCTION f1() RETURN test.t1.column1%TYPE; Details: - Adding a new sql_mode_t parameter to sp_head::create() sp_head::sp_head() sp_package::create() sp_package::sp_package() to guarantee early initialization of sp_head::m_sql_mode. Before this change, this member was not initialized at all during CREATE FUNCTION/PROCEDURE/PACKAGE statements, and was not used. Now it needs to be initialized to write properly the mysql.proc.returns column, according to the create time sql_mode. - Code refactoring to make the things simpler and functions smaller: * Adding a new method Field_row::row_create_fields(THD thd, List<Spvar_definition> list) to make a Virtual_tmp_table with Fields for ROW members from an explicit definition. * Adding a new method Field_row::row_create_fields(THD thd, const Spvar_definition &def) to make a Virtual_tmp_table with Fields for ROW members from an explicit or a table anchored definition. Adding a new method Item_args::add_array_of_item_field(THD thd, const Virtual_tmp_table &vtable) to create an array of Item_field corresponding to all Field instances in a Virtual_tmp_table Removing Item_field_row::row_create_items(). It was decomposed into the new methods described above. 
* Moving the code from the loop body in sp_rcontext::init_var_items() into a separate method Spvar_definition::make_item_field_row(), to make the code clearer (smaller functions). make_item_field_row() itself uses the new methods described above. - Changing the data type of sp_head::m_return_field_def from Column_definition to Spvar_definition. So now it supports not only SQL column field types, but also explicit ROW and anchored ROW data types, as well as anchored column types. - Adding a new Column_definition parameter to sp_head::create_result_field(). Before this patch, create_result_field() took the definition only from m_return_field_def. Now it's also called with a local Column_definition variable which contains the explicit definition resolved from an anchored definition. - Modifying sql_yacc.yy to support the new grammar. Adding new helper methods: * sf_return_fill_definition_row() * sf_return_fill_definition_rowtype_of() * sf_return_fill_definition_type_of() - Fixing tests in: * Virtual_tmp_table::setup_field_pointers() in sql_select.cc * Send_field::normalize() in field.h * store_column_type() to prevent calling Type_handler_row::field_type(), which is implemented as a DBUG_ASSERT(0). Before this patch the affected methods and functions were called only for scalar data types. Now ROW is also possible. - Adding a new virtual method Field::cols() - Overriding methods: Item_func_sp::cols() Item_func_sp::element_index() Item_func_sp::check_cols() Item_func_sp::bring_value() to support the ROW data type. - Extending the rule sp_return_type to support * explicit ROW and anchored ROW data types * anchored scalar data types - Overriding Field_row::sql_type() to print the data type of an explicit ROW.	2024-10-21 07:59:29 +04:00
Alexander Barkov	dfaf7e2eb4	MDEV-15751 CURRENT_TIMESTAMP should return a TIMESTAMP [WITH TIME ZONE?] Changing the return type of the following functions: - CURRENT_TIMESTAMP, CURRENT_TIMESTAMP(), NOW() - SYSDATE() - FROM_UNIXTIME() from DATETIME to TIMESTAMP. Note, the old function NOW() returning DATETIME is still available as LOCALTIMESTAMP or LOCALTIMESTAMP(), e.g.: SELECT LOCALTIMESTAMP, -- DATETIME CURRENT_TIMESTAMP; -- TIMESTAMP The change in the functions' return data type fixes some problems that occurred near a DST change: - Problem #1 INSERT INTO t1 (timestamp_field) VALUES (CURRENT_TIMESTAMP); INSERT INTO t1 (timestamp_field) VALUES (COALESCE(CURRENT_TIMESTAMP)); could result in two different values being inserted. - Problem #2 INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526)); INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526+3600)); could result in two equal TIMESTAMP values near a DST change. Additional changes: - FROM_UNIXTIME(0) now returns SQL NULL instead of '1970-01-01 00:00:00' (assuming time_zone='+00:00') - UNIX_TIMESTAMP('1970-01-01 00:00:00') now returns SQL NULL instead of 0 (assuming time_zone='+00:00') These additional changes are needed for consistency with TIMESTAMP fields, which cannot store '1970-01-01 00:00:00 +00:00'	2024-10-19 22:48:23 +02:00
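
Problem #2 from MDEV-15751 can be reproduced outside the server. The sketch below is only an illustration: it assumes a hypothetical time zone that falls back from UTC+4 to UTC+3 at Unix time 1288479600 (the commit message does not name a zone), and `wall_clock` is a made-up helper, not a server function. It shows two instants one hour apart collapsing to the same DATETIME-style wall-clock value while the epoch values stay distinct.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a zone that switches from UTC+4 to UTC+3 at this instant
# (2010-10-30 23:00:00 UTC). The commit message does not name a zone.
FALL_BACK_AT = 1288479600


def wall_clock(epoch_seconds: int) -> datetime:
    """Return the naive local wall-clock time, i.e. what a DATETIME would hold."""
    offset_hours = 4 if epoch_seconds < FALL_BACK_AT else 3
    zone = timezone(timedelta(hours=offset_hours))
    return datetime.fromtimestamp(epoch_seconds, zone).replace(tzinfo=None)


first = 1288477526        # value used in the commit message
second = first + 3600     # one hour later, after the assumed fall-back

print(wall_clock(first))   # wall-clock reading before the switch
print(wall_clock(second))  # same wall-clock reading after the switch
print(first != second)     # the epoch values remain distinguishable
```

Because a TIMESTAMP keeps the epoch value rather than the wall-clock reading, the two rows stay different once the functions return TIMESTAMP instead of DATETIME.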
Sergei Golubchik	128fc34990	fix rdiff files in sys_var suite	2024-10-19 16:54:48 +02:00
Sergei Golubchik	15a291e4e0	MDEV-14978 fix client.client-env-variable test * fix paths to work when installed and not only from the source dir * don't use a cnf file (no need to restart the server for this) * set MYSQL_HOST to a valid hostname when testing an invalid MARIADB_HOST * use invalid ip to have clients fail quickly and not waste time on resolving the invalid hostname followup for `eedbb901e5`	2024-10-19 16:53:16 +02:00
Kristian Nielsen	abc46259c6	MDEV-34753 memory pressure - erroneous termination condition Fix race condition in test case by waiting for the expected state to occur. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-10-19 17:20:27 +11:00
Daniel Black	eb29190398	MDEV-34753 memory pressure - erroneous termination condition The 'if (!m_abort) break' condition was inverted by accident. Constrain the test case to environments where there is cgroupv2 runtime environment which is the same case that will pass a memory pressure initialization. Remove the explicit garbage_collection trigger as it hides the abnormal termination error on the event loop for memory pressure. This also means there is no support in non-cgroupv2 environments (possibly some container environments). As the trigger to memory pressure is via a different thread we need to wait until a "[mM]emory pressure" log message is there to know it has succeeded or failed. Thanks Kristian Nielsen for noticing and review.	2024-10-19 17:20:27 +11:00
Sergei Petrunia	a68e74b5a4	MDEV-35164: optimizer_join_limit_pref_ratio: assertion when the ORDER BY table becomes constant Assertion failure has happened due to this scenario: A query was run with optimizer_join_limit_pref_ratio=1. The query had "ORDER BY t1.col LIMIT N". The optimizer set join->limit_shortcut_applicable=1. Then, table t1 was marked as constant. The code in choose_query_plan() still set join->limit_optimization_mode=1 which caused the optimizer to only consider t1 as the first non-const table. But t1 was already put into the join prefix as the constant table. The optimizer couldn't produce any join order at all and crashed. Fixed by not searching for shortcut plan if ORDER BY table is a constant. We will not try to do sorting anyway in this case (and LIMIT short-cutting will be done for any join order).	2024-10-18 15:42:05 +03:00
Rucha Deodhar	e14d2b7974	MDEV-8578: Wrong error code/message with enforce_storage_engine and NO_ENGINE_SUBSTITUTION Analysis: When the error is hit, wrong error code is passed in my_error Fix: Pass a better error code.	2024-10-18 16:42:52 +05:30
Sergei Petrunia	0540eac05c	MDEV-35180: ref_to_range rewrite causes poor query plan (Variant 2: only allow rewrite for ref(const)) make_join_select() has a "ref_to_range" rewrite: it would rewrite any ref access to a range access on the same index if the latter uses more keyparts. It seems the initial intent of this was to fix poor query plan choice in cases like t.keypart1=const AND t.keypart2 < 'foo' Due to a deficiency in the cost model, ref access could be picked while range would enumerate fewer rows and be cheaper. However, the condition also forces a rewrite in cases like: t.keypart1=prev_table.col AND t.keypart1<='foo' AND t.keypart2<'bar' Here, it can be that * keypart1=prev_table.col is highly selective * (keypart1, keypart2) <= ('foo', 'bar') is not at all selective. Still, the rewrite would be made and a poor query plan chosen. Fixed this by only doing the rewrite if ref access was ref(const) so we can be certain that quick select also used these restrictions and will scan a subset of rows that ref access would scan.	2024-10-18 13:37:04 +03:00
Marko Mäkelä	ebefef658e	Merge 10.11 into 11.2	2024-10-18 11:32:22 +03:00
Marko Mäkelä	eca552a1a4	MDEV-34830: LSN in the future is not being treated as serious corruption The invariant of write-ahead logging is that before any change to a page is written to the data file, the corresponding log record must first have been durably written. In crash recovery, there were some sloppy checks for this. Let us implement accurate checks and flag an inconsistency as a hard error, so that we can avoid further corruption of a corrupted database. For data extraction from the corrupted database, innodb_force_recovery can be used. Before recovery is reading any data pages or invoking buf_dblwr_t::recover() to recover torn pages from the doublewrite buffer, InnoDB will have parsed the log until the final LSN and updated log_sys.lsn to that. So, we can rely on log_sys.lsn at all times. The doublewrite buffer recovery has been refactored in such a way that the recv_sys.dblwr.pages may be consulted while discovering files and their page sizes, but nothing will be written back to data files before buf_dblwr_t::recover() is invoked. recv_max_page_lsn, recv_lsn_checks_on: Remove. recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging condition at the end of the recovery. recv_dblwr_t::validate_page(): Keep track of the maximum LSN (if we are checking a non-doublewrite copy of a page) but do not complain about the LSN being in the future. The doublewrite buffer is a special case, because it will be read early during recovery. Besides, starting with commit `762bcb81b5` the dblwr=true copies of pages may legitimately be "too new". recv_dblwr_t::find_page(): Find a valid page with the smallest FIL_PAGE_LSN that is in the valid range for recovery. recv_dblwr_t::restore_first_page(): Replaced by find_page(). Only buf_dblwr_t::recover() will write to data files. buf_dblwr_t::recover(): Simplify the message output. Do attempt doublewrite recovery on user page read error. Ignore doublewrite pages whose FIL_PAGE_LSN is outside the usable bounds. 
Previously, we could wrongly recover a too new page from the doublewrite buffer. It is unlikely that this could have led to an actual error. Write back all recovered pages from the doublewrite buffer here, including for the first page of any tablespace. buf_page_is_corrupted(): Distinguish the return values CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER. buf_page_check_corrupt(): Return the error code DB_CORRUPTION in case the LSN is in the future. Datafile::read_first_page_flags(): Split from read_first_page(). Take a copy of the first page as a parameter. recv_sys_t::free_corrupted_page(): Take the file as a parameter and return whether a message was displayed. This avoids some duplicated and incomplete error messages. buf_page_t::read_complete(): Remove some redundant output and always display the name of the corrupted file. Never return DB_FAIL; use it only in internal error handling. IORequest::read_complete(): Assume that buf_page_t::read_complete() will have reported any error. fil_space_t::set_corrupted(): Return whether this is the first time the tablespace had been flagged as corrupted. Datafile::validate_first_page(), fil_node_open_file_low(), fil_node_open_file(), fil_space_t::read_page0(), fil_node_t::read_page0(): Add a parameter for a copy of the first page, and a parameter to indicate whether the FIL_PAGE_LSN check should be suppressed. Before buf_dblwr_t::recover() is invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the FSP_SPACE_FLAGS and the tablespace ID that may be present in a potentially too new copy of a page. Reviewed by: Debarun Banerjee	2024-10-18 10:12:47 +03:00
Brandon Nesterenko	e213e916ad	MDEV-32014: Fix mysqld--help,win.rdiff	2024-10-17 15:53:00 -06:00
Sergei Golubchik	3a1cf2c85b	MDEV-34679 ER_BAD_FIELD uses non-localizable substrings	2024-10-17 21:37:37 +02:00
Sergei Golubchik	3693fb9581	MDEV-25199 cluster fails to start up if you need innodb in your test - enable it yourself	2024-10-17 21:37:37 +02:00
Sergei Golubchik	e1e836fc76	update results after the merge	2024-10-17 21:37:37 +02:00
Sergei Golubchik	3da565c41d	MDEV-35144 CREATE TABLE ... LIKE uses current innodb_compression_default instead of the create value When adding a column or index that uses plugin-defined sysvar-based options with CREATE ... LIKE the server was using the current value of the sysvar, not the default one. Because parse_option_list() function was used both in create and open and it tried to guess when it's create (need to use current sysvar value and add a new name=value pair to the list) or open (need to use default, without extending the list). Let's move the list extending functionality into a separate function and call it explicitly when needed. Operations that add new objects (CREATE, ALTER ... ADD) will extend the list, other operations (ALTER, CREATE ... LIKE, open) will not.	2024-10-17 16:28:39 +02:00
Marko Mäkelä	bb47e575de	MDEV-34830: LSN in the future is not being treated as serious corruption The invariant of write-ahead logging is that before any change to a page is written to the data file, the corresponding log record must first have been durably written. On crash recovery, there were some sloppy checks for this. Let us implement accurate checks and flag an inconsistency as a hard error, so that we can avoid further corruption of a corrupted database. For data extraction from the corrupted database, innodb_force_recovery can be used. Before recovery is reading any data pages or invoking buf_dblwr_t::recover() to recover torn pages from the doublewrite buffer, InnoDB will have parsed the log until the final LSN and updated log_sys.lsn to that. So, we can rely on log_sys.lsn at all times. The doublewrite buffer recovery has been refactored in such a way that the recv_sys.dblwr.pages may be consulted while discovering files and their page sizes, but nothing will be written back to data files before buf_dblwr_t::recover() is invoked. A section of the test mariabackup.innodb_redo_overwrite that is parsing some mariadb-backup --backup output has been removed, because that output "redo log block is overwritten" would often be missing in a Microsoft Windows environment as a result of these changes. recv_max_page_lsn, recv_lsn_checks_on: Remove. recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging condition at the end of the recovery. recv_dblwr_t::validate_page(): Keep track of the maximum LSN (if we are checking a non-doublewrite copy of a page) but do not complain about the LSN being in the future. The doublewrite buffer is a special case, because it will be read early during recovery. Besides, starting with commit `762bcb81b5` the dblwr=true copies of pages may legitimately be "too new". recv_dblwr_t::find_page(): Find a valid page with the smallest FIL_PAGE_LSN that is in the valid range for recovery. 
recv_dblwr_t::restore_first_page(): Replaced by find_page(). Only buf_dblwr_t::recover() will write to data files. buf_dblwr_t::recover(): Simplify the message output. Do attempt doublewrite recovery on user page read error. Ignore doublewrite pages whose FIL_PAGE_LSN is outside the usable bounds. Previously, we could wrongly recover a too new page from the doublewrite buffer. It is unlikely that this could have led to an actual error. Write back all recovered pages from the doublewrite buffer here, including for the first page of any tablespace. buf_page_is_corrupted(): Distinguish the return values CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER. buf_page_check_corrupt(): Return the error code DB_CORRUPTION in case the LSN is in the future. Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff in the same way on both 32-bit and 64-bit architectures. Datafile::read_first_page_flags(): Split from read_first_page(). Take a copy of the first page as a parameter. recv_sys_t::free_corrupted_page(): Take the file as a parameter and return whether a message was displayed. This avoids some duplicated and incomplete error messages. buf_page_t::read_complete(): Remove some redundant output and always display the name of the corrupted file. Never return DB_FAIL; use it only in internal error handling. IORequest::read_complete(): Assume that buf_page_t::read_complete() will have reported any error. fil_space_t::set_corrupted(): Return whether this is the first time the tablespace had been flagged as corrupted. Datafile::validate_first_page(), fil_node_open_file_low(), fil_node_open_file(), fil_space_t::read_page0(), fil_node_t::read_page0(): Add a parameter for a copy of the first page, and a parameter to indicate whether the FIL_PAGE_LSN check should be suppressed. Before buf_dblwr_t::recover() is invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the FSP_SPACE_FLAGS and the tablespace ID that may be present in a potentially too new copy of a page. 
Reviewed by: Debarun Banerjee	2024-10-17 17:24:20 +03:00
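
The write-ahead-logging condition described in MDEV-34830 can be pictured as a comparison between a page's LSN and the end of the parsed log. The following is a simplified sketch only: `classify_page`, `PageStatus` and both arguments are hypothetical names, and the actual InnoDB checks (for example in buf_page_is_corrupted() and recv_sys_t::validate_checkpoint()) depend on more state than is modelled here.

```python
from enum import Enum


class PageStatus(Enum):
    OK = "ok"
    # Name borrowed from the commit message; the enum itself is hypothetical.
    CORRUPTED_FUTURE_LSN = "future_lsn"


def classify_page(page_lsn: int, log_end_lsn: int) -> PageStatus:
    """Hypothetical check of the write-ahead-logging invariant.

    A page in a data file must not carry an LSN newer than the last
    log record that was durably written and parsed during recovery.
    """
    if page_lsn > log_end_lsn:
        return PageStatus.CORRUPTED_FUTURE_LSN
    return PageStatus.OK


print(classify_page(page_lsn=100, log_end_lsn=200))
print(classify_page(page_lsn=300, log_end_lsn=200))
```

In this model a page whose LSN equals the log end is still acceptable; only a strictly newer LSN is flagged.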
Brandon Nesterenko	39cce39ae1	MDEV-32014: typo fix in test	2024-10-17 07:54:09 -06:00
Sergei Golubchik	70aa713f58	MDEV-32014 test fix	2024-10-17 07:53:59 -06:00
Libing Song	72cc58bb71	MDEV-32014 Rename binlog cache temporary file to binlog file for large transaction Description =========== When a transaction commits, it copies the binlog events from binlog cache to binlog file. Very large transactions (e.g. gigabytes) can stall other transactions for a long time because the data is copied while holding LOCK_log, which blocks other commits from binlogging. The solution in this patch is to rename the binlog cache file to a binlog file instead of copying it, if the committing transaction has a large binlog cache. Rename is a very fast operation, so it doesn't block other transactions for a long time. Design ====== * binlog_large_commit_threshold type: ulonglong scope: global dynamic: yes default: 128MB Only the binlog cache temporary files larger than 128MB are renamed to binlog file. * #binlog_cache_files directory To support rename, all binlog cache temporary files are managed as normal files now. `#binlog_cache_files` directory is in the same directory with binlog files. It is created at server startup if it doesn't exist. Otherwise, all files in the directory are deleted at startup. The temporary files are named with ML_ prefix and the memory address of the binlog_cache_data object which guarantees it is unique. * Reserve space To support the rename feature, enough space must be reserved at the beginning of the binlog cache file. The space is required for Format description, Gtid list, checkpoint and Gtid events when renaming it to a binlog file. Since binlog_cache_data's cache_log is directly accessed by binlog log, online alter and wsrep, it is not easy to update all the code. Thus binlog cache will not reserve space if it is not session binlog cache or wsrep session is enabled. - m_file_reserved_bytes Stores the bytes reserved at the beginning of the cache file. It is initialized in write_prepare() and cleared by reset(). The reserved file header is hidden from callers. Thus there is no change for callers. E.g. 
- get_byte_position() still gets the length of binlog data written to the cache, but not the file length. - truncate(0) will truncate the file to m_file_reserved_bytes but not 0. - write_prepare() write_prepare() is called every time anything is being written into the cache. It will call init_file_reserved_bytes() to create the cache file (if it doesn't exist) and reserve suitable space if the data written exceeds buffer's size. * Binlog_commit_by_rotate It is used to encapsulate the code for renaming a binlog cache temporary file to binlog file. - should_commit_by_rotate() it is called by write_transaction_to_binlog_events() to check if a binlog cache should be renamed to a binlog file. - commit() That is the entry to rename a binlog cache and commit the transaction. Both rename and commit are protected by LOCK_log, thus no other transactions can write anything into the renamed binlog before it. Rename happens in a rotation. After the new binlog file is generated, replace_binlog_file() is called to: - copy data from the new binlog file to its binlog cache file. - write gtid event. - rename the binlog cache file to binlog file. After that the rotation will continue to succeed. Then the transaction is committed in a separate group by itself. Its cache file will be detached and cache log will be reset before calling trx_group_commit_with_engines(). Thus only the Xid event is written.	2024-10-17 07:53:59 -06:00
Oleksandr Byelkin	600c42ea86	MDEV-34883 LOAD DATA INFILE with geometry data fails We write field using field data charset, so we should read it using the field charset.	2024-10-17 10:33:36 +02:00
Sergei Golubchik	3b58c6b93f	MDEV-35079 Migrate MySQL5.7 to MariaDB 10.4, then to MariaDB 10.11 Failed correctly detect when partitioning is disabled	2024-10-17 10:08:24 +02:00
Sergei Golubchik	7842cab8c0	MDEV-34318 post-merge fix	2024-10-17 10:08:24 +02:00
Sergei Golubchik	6b436cba01	Revert "Fixes buildbot issue with plugin.fulltext_plugin" This reverts commit `a8010e7689`. The test doesn't require embedded after `ab15628bbc`	2024-10-17 09:11:47 +02:00
Thirunarayanan Balathandayuthapani	4a1ded61a4	MDEV-34529 Shrink the system tablespace when system tablespace contains MDEV-30671 leaked undo pages - InnoDB fails to shrink the system tablespace when it contains the leaked undo log pages caused by MDEV-30671. - InnoDB does free the unused segment in system tablespace before shrinking the tablespace. InnoDB fails to free the unused segment if XA PREPARE transaction exist or if the previous shutdown was not with innodb_fast_shutdown=0 inode_info: Structure to store the inode page and offsets. fil_space_t::garbage_collect(): Frees the system tablespace unused segment fsp_get_sys_used_segment(): Iterates through all default file segment and index segment present in system tablespace. trx_sys_t::is_xa_exist(): Returns true if the XA transaction exist in the undo logs fseg_inode_free(): Frees the extents, fragment pages for the given index node and ignores any error similar to trx_purge_free_segment() trx_sys_t::reset_page(): Retain the TRX_SYS_FSEG_HEADER value in trx_sys page while resetting the page.	2024-10-16 21:34:24 +05:30
Monty	4955f6018a	MDEV-29351 SIGSEGV when doing forward reference of item in select list The reason for the crash was the code assumed that SELECT_LEX.ref_pointer_array would be initialized with zero, which was not the case. This caused the test of if (!select->ref_pointer_array[counter]) in item.cc to be unpredictable and cause crashes. Fixed by zero-filling ref_pointer_array on allocation.	2024-10-16 17:24:46 +03:00
Monty	0de2613e7a	Fixed that SHOW CREATE TABLE for sequences shows used table options	2024-10-16 17:24:46 +03:00
Monty	2c52fdd28a	MDEV-32350 Can't selectively restore sequences using innodb tables from backup Added support for sequences to do discard and import tablespace	2024-10-16 17:24:46 +03:00
Monty	ee908140ac	Fixed bug in main.connect test where Connection_errors showed wrong value	2024-10-16 17:24:46 +03:00
Monty	a8010e7689	Fixes buildbot issue with plugin.fulltext_plugin The test is using features not in the embedded server. Fixed by including not_embedded.inc	2024-10-16 17:24:46 +03:00
Vladislav Vaintroub	c1fc59277a	MDEV-34929 page-compressed tables do not work on Windows Remove workaround for MDEV-13941, it served for 5 years, and all affected pre-release 10.2 installations should have been already fixed in between. Apparently InnoDB is using the is_sparse parameter in os_file_set_size() inconsistently, and it passes is_sparse=false now during first file extension. With the MDEV-13941 workaround in place, it would unsparse the file, which makes compression not work at all anymore.	2024-10-16 16:02:13 +02:00
Sergei Petrunia	89493a9980	MDEV-34993: fix merge into 10.6: OPTIMIZER_ADJ_FIX_CARD_MULT should be ON by default	2024-10-16 16:46:38 +03:00
Sergei Golubchik	5ebda30ccc	Revert "MDEV-35019 Provide a way to enable "rollback XA on disconnect" behavior we had before 10.5.2" This reverts commit `8ae462a220`.	2024-10-16 13:23:47 +02:00
Kristian Nielsen	8ae462a220	MDEV-35019 Provide a way to enable "rollback XA on disconnect" behavior we had before 10.5.2 Implement variable legacy_xa_rollback_at_disconnect to support backwards compatibility for applications that rely on the pre-10.5 behavior for connection disconnect, which is to rollback the transaction (in violation of the XA specification). Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-10-16 10:18:36 +02:00
Sergei Petrunia	9849e3f948	MDEV-35072: Assertion with optimizer_join_limit_pref_ratio and 1-table select Pre-11.0 variant: 1. In recompute_join_cost_with_limit(), add an assertion that partial_join_cost >= 0.0. 2. best_extension_by_limited_search() subtracts COST_EPS from join->best_read. But it is not subtracted from join->positions[0].read_time, add it back. 3. We could get very small negative partial_join_cost due to rounding errors. For fraction=1.0, we were computing essentially this (denote as EXPR-1): $row_read_cost + $where_cost - ($row_read_cost + $where_cost) which should compute to 0. But the computation was done in the following order (left-to-right): EXPR-2: ($row_read_cost + $where_cost) - $row_read_cost - $where_cost this produced a value of -1.1102230246251565e-16 due to a rounding error. Change the computation to use EXPR-1 instead of EXPR-2.	2024-10-15 15:56:41 +03:00
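
The rounding effect described in MDEV-35072 is easy to reproduce with ordinary double-precision values. The numbers below are illustrative stand-ins chosen for this sketch, not the optimizer's actual costs, so the residue differs in sign and size from the one quoted in the commit message.

```python
row_read_cost = 0.1   # illustrative value, not taken from the optimizer
where_cost = 0.2      # illustrative value, not taken from the optimizer

# EXPR-1: the same sum is computed on both sides, so the result is exactly 0.
expr1 = row_read_cost + where_cost - (row_read_cost + where_cost)

# EXPR-2: subtracting the terms one at a time leaves a rounding residue,
# because the rounded sum is not exactly row_read_cost + where_cost.
expr2 = (row_read_cost + where_cost) - row_read_cost - where_cost

print(expr1)  # 0.0
print(expr2)  # 2.7755575615628914e-17
```

Subtracting an identically computed sum cancels exactly for any finite inputs, which is why switching to the EXPR-1 form avoids a spurious nonzero (and possibly negative) partial cost.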
Sergei Petrunia	66b8d32b75	MDEV-35072: Assertion with optimizer_join_limit_pref_ratio and 1-table select Variant for 11.2+: In recompute_join_cost_with_limit(), do not subtract the cost of checking the WHERE: pos->records_read* WHERE_COST_THD(join->thd) It is already included in pos->read_time. Also added comments about difference between this fix and the pre-11.2 variant.	2024-10-15 15:01:29 +03:00

79728 commits