mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-29 02:05:57 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	15700f54c2	Merge 11.4 into 11.7	2025-01-09 09:41:38 +02:00
Marko Mäkelä	17f01186f5	Merge 10.11 into 11.4	2025-01-09 07:58:08 +02:00
Marko Mäkelä	420d9eb27f	Merge 10.6 into 10.11	2025-01-08 12:51:26 +02:00
Marko Mäkelä	f20ee931d8	Merge 10.5 into 10.6 Note: Changes to the test innodb.stats_persistent in commit `e5c4c0842d` (MDEV-35443) are not merged, because the test scenario is impossible due to commit `e66928ab28` (MDEV-33462).	2025-01-03 09:10:25 +02:00
Marko Mäkelä	3f914afd3a	Merge 10.6 into 10.11	2025-01-02 12:39:56 +02:00
Oleg Smirnov	24e5d56400	MDEV-35680 Table number > MAX_TABLES causes overflow of table_map at main.join test Fix a regression introduced by commit `d98ac851` (MDEV-29935, MDEV-26247) causing MAX_TABLES overflow in `setup_table_map()`. The check for MAX_TABLES was moved outside of the loop that increments table numbers, allowing overflows during loop iterations. Since setup_table_map() operates on a 64-bit bitmap, table numbers exceeding 64 triggered the UBSAN check. This commit returns the overflow check within the loop and adds a debug assertion to `setup_table_map()` to ensure no bitmap overrun occurs.	2024-12-24 15:54:56 +07:00
Yuchen Pei	671f80c738	Merge branch '10.5' into 10.6	2024-12-17 11:06:09 +11:00
Oleg Smirnov	d98ac8511e	MDEV-26247 MariaDB Server SEGV on INSERT .. SELECT This problem occurred for statements like `INSERT INTO t1 SELECT 1`, which do not have tables in the SELECT part. In such scenarios SELECT_LEX::insert_tables was not properly set at `setup_tables()`, and this led to either incorrect execution or a crash. Reviewer: Oleksandr Byelkin <sanja@mariadb.com>	2024-12-14 14:04:21 +07:00
Oleg Smirnov	e640373389	Revert "MDEV-26427 MariaDB Server SEGV on INSERT .. SELECT" This reverts commit `49e14000ee` as it introduces regression MDEV-29935 and has to be reconsidered in general	2024-12-14 13:08:17 +07:00
Dmitry Shulga	54c1031b74	MDEV-34958: after Trigger doesn't work correctly with bulk insert This bug has the same nature as the issues MDEV-34718: Trigger doesn't work correctly with bulk update MDEV-24411: Trigger doesn't work correctly with bulk insert To fix the issue covering all use cases, resetting the thd->bulk_param temporarily to the value nullptr before invoking triggers and restoring its original value on finishing execution of a trigger is moved to the method Table_triggers_list::process_triggers that is ultimately invoked for any kind of triggers.	2024-12-13 16:19:39 +07:00
Oleksandr Byelkin	b12ff287ec	Merge branch '11.6' into 11.7	2024-11-10 19:22:21 +01:00
Oleksandr Byelkin	9e1fb104a3	Merge tag '11.4' into 11.6 MariaDB 11.4.4 release	2024-11-08 07:17:00 +01:00
Sergei Golubchik	784becf3e1	MDEV-35267 Server crashes in _ma_reset_history upon altering on Aria table with vector key under lock ALTER TABLE needs to open hlindex tables early enough, right after they were created, so that cleanup after an error would see and delete them. But they need to be external_lock-ed only in copy_data_between_tables, after mysql_trans_prepare_alter_copy_data(). Let's move locking out of hlindex_open() into hlindex_lock()	2024-11-05 14:00:52 -08:00
Sergei Golubchik	f6de9a379a	MDEV-34919 post-fix * add Aria truncate checks * do store_lock() with a correct TL_xxx level * remove InnoDB workaround for missing store_lock (from MDEV-35032) * don't start transaction in temp tables (for Aria, with a test case)	2024-11-05 14:00:52 -08:00
Sergey Vojtovich	1cc7ef52e3	MDEV-34919 Aria crashes with high-level (vector) indexes Since high-level index tables do not participate in thr_multi_lock(), added explicit call to THR_LOCK::start_trans(). This is needed mostly for Aria to handle transaction logging.	2024-11-05 14:00:52 -08:00
Sergei Golubchik	9f80e3fbb7	MDEV-35032 streaming mode for mhnsw search support SQL semantics for SELECT ... WHERE ... ORDER BY ... LIMIT * switch from returning k nearest neighbors to returning as many as needed, in k-neighbor chunks, with increasing distance * make search_layer() skip nodes that are closer than a threshold * read_next keeps a search context - list of k found nodes, threshold, ctx, etc. * when the list of found nodes is exhausted, it repeats the search starting from last found nodes and a threshold * search context keeps ctx->refcount incremented, so ctx won't go away * but commit_lock is unlocked between calls, so InnoDB can modify the table * use ctx version to detect that, switch to MHNSW_Trx when it happens bugfix: * use the correct lock in ha_external_lock() for the graph table * InnoDB didn't reset locks on ha_external_lock(F_UNLCK) and previous LOCK_X leaked into the next statement	2024-11-05 14:00:51 -08:00
Sergei Golubchik	97b2392ede	cleanup: TABLE_SHARE::lock_share() helper also: renames, s/const/constexpr/ for consistency	2024-11-05 14:00:50 -08:00
Sergey Vojtovich	3283688797	Simplified quick_rm_table() and mysql_rename_table() Replaced obscure FRM_ONLY, NO_FRM_RENAME, NO_HA_TABLE, NO_PAR_TABLE with straightforward explicit flags: QRMT_FRM - [re]moves .frm QRMT_PAR - [re]moves .par QRMT_HANDLER - calls ha_delete_table()/ha_rename_table() and [re]moves high-level indexes QRMT_DEFAULT - same as QRMT_FRM | QRMT_HANDLER, which is regular table drop/rename.	2024-11-05 14:00:50 -08:00
Sergey Vojtovich	a90fa3f397	ALTER TABLE fixes for high-level indexes (i) Fixes for ALTER TABLE ... ADD/DROP COLUMN, ALGORITHM=COPY. Let quick_rm_table() remove high-level indexes along with original table. Avoid locking uninitialized LOCK_share for INTERNAL_TMP_TABLEs. Don't enable bulk insert when altering a table containing vector index. InnoDB can't handle situation when bulk insert is enabled for one table but disabled for another. We can't do bulk insert on vector index as it does table updates currently.	2024-11-05 14:00:50 -08:00
Sergei Golubchik	ebcbed6d74	post-fixes for TRUNCATE * fix the truncate-by-handler variant, used by InnoDB * test that insert works after truncate, meaning graph table was emptied * test that the vector index size is zero after truncate in MyISAM	2024-11-05 14:00:49 -08:00
Sergei Golubchik	f44989ff0f	UPDATE/DELETE post-fixes	2024-11-05 14:00:49 -08:00
Hugo Wen	0e2b9e7621	MDEV-33408 Initial support for vector DELETE and UPDATE When the source row is deleted, mark the corresponding node in HNSW index by setting `tref` to null. An index is added for the `tref` in secondary table for faster searching of the to-be-marked nodes. The nodes marked as deleted will still be used for search, but will not be included in the final query results. As skipping deleted nodes and not adding deleted nodes for new-inserted nodes' neighbor list could impact the performance, we now only skip these nodes in search results. - for some reason the bitmap is not set for hlindex during the delete so I had to temporarily comment out one line All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2024-11-05 14:00:49 -08:00
Sergei Golubchik	049d839350	mhnsw: inter-statement shared cache * preserve the graph in memory between statements * keep it in a TABLE_SHARE, available for concurrent searches * nodes are generally read-only, walking the graph doesn't change them * distance to target is cached, calculated only once * SIMD-optimized bloom filter detects visited nodes * nodes are stored in an array, not List, to better utilize bloom filter * auto-adjusting heuristic to estimate the number of visited nodes (to configure the bloom filter) * many threads can concurrently walk the graph. MEM_ROOT and Hash_set are protected with a mutex, but walking doesn't need them * up to 8 threads can concurrently load nodes into the cache, nodes are partitioned into 8 mutexes (8 is chosen arbitrarily, might need tuning) * concurrent editing is not supported though * this is fine for MyISAM, TL_WRITE protects the TABLE_SHARE and the graph (note that TL_WRITE_CONCURRENT_INSERT is not allowed, because an INSERT into the main table means multiple UPDATEs in the graph) * InnoDB uses secondary transaction-level caches linked in a list in thd->ha_data via a fake handlerton * on rollback the secondary cache is discarded, on commit nodes from the secondary cache are invalidated in the shared cache while it is exclusively locked * on savepoint rollback both caches are flushed. this can be improved in the future with a row visibility callback * graph size is controlled by @@mhnsw_cache_size, the cache is flushed when it reaches the threshold	2024-11-05 14:00:49 -08:00
Sergei Golubchik	25b4000290	InnoDB support for hlindexes and mhnsw * mhnsw: * use primary key, innodb loves it (and the index cannot have dupes anyway) * MyISAM is ok with that, performance-wise * must be ha_rnd_init(0) because we aren't going to scan * MyISAM resets the position on ha_rnd_init(0) so query it before * oh, and use the correct handler, just in case * HA_ERR_RECORD_IS_THE_SAME is no error * innodb: * return ref_length on create * don't assume table->pos_in_table_list is set * ok, assume away, but only for system versioned tables * set alter_info on create (InnoDB needs to check for FKs) * pair external_lock/external_unlock correctly	2024-11-05 14:00:49 -08:00
Sergei Golubchik	613542dceb	mhnsw: build indexes with the columns of exactly right size	2024-11-05 14:00:49 -08:00
Vicențiu Ciorbaru	88839e71a3	Initial HNSW implementation This commit includes the work done in collaboration with Hugo Wen from Amazon: MDEV-33408 Alter HNSW graph storage and fix memory leak This commit changes the way HNSW graph information is stored in the second table. Instead of storing connections as separate records, it now stores neighbors for each node, leading to significant performance improvements and storage savings. Comparing with the previous approach, the insert speed is 5 times faster, search speed improves by 23%, and storage usage is reduced by 73%, based on ann-benchmark tests with random-xs-20-euclidean and random-s-100-euclidean datasets. Additionally, in previous code, vector objects were not released after use, resulting in excessive memory consumption (over 20GB for building the index with 90,000 records), preventing tests with large datasets. Now ensure that vectors are released appropriately during the insert and search functions. Note there are still some vectors that need to be cleaned up after search query completion. Needs to be addressed in a future commit. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc. As well as the commit: Introduce session variables to manage HNSW index parameters Three variables: hnsw_max_connection_per_layer hnsw_ef_constructor hnsw_ef_search ann-benchmark tool is also updated to support these variables in commit https://github.com/HugoWenTD/ann-benchmarks/commit/e09784e for branch https://github.com/HugoWenTD/ann-benchmarks/tree/mariadb-configurable All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc. Co-authored-by: Hugo Wen <wenhug@amazon.com>	2024-11-05 14:00:48 -08:00
Sergei Golubchik	d6add9a03d	initial support for vector indexes MDEV-33407 Parser support for vector indexes The syntax is create table t1 (... vector index (v) ...); limitation: * v is a binary string and NOT NULL * only one vector index per table * temporary tables are not supported MDEV-33404 Engine-independent indexes: subtable method added support for so-called "high level indexes", they are not visible to the storage engine, implemented on the sql level. For every such an index in a table, say, t1, the server implicitly creates a second table named, like, t1#i#05 (where "05" is the index number in t1). This table has a fixed structure, no frm, not accessible directly, doesn't go into the table cache, needs no MDLs. MDEV-33406 basic optimizer support for k-NN searches for a query like SELECT ... ORDER BY func() optimizer will use item_func->part_of_sortkey() to decide what keys can be used to resolve ORDER BY.	2024-11-05 14:00:48 -08:00
Sergei Golubchik	08a7f18b19	cleanup: init_tmp_table_share(bool thread_specific) let the caller tell init_tmp_table_share() whether the table should be thread_specific or not. In particular, internal tmp tables created in the slave thread are perfectly thread specific	2024-11-05 14:00:48 -08:00
Sergei Golubchik	44c6328cbb	cleanup: thd->alloc<>() and thd->calloc<>() create templates thd->alloc<X>(n) to use instead of (X*)thd->alloc(sizeof(X)*n) and the same for thd->calloc(). By default the type is char, so old usage of thd->alloc(size) works too.	2024-11-05 14:00:48 -08:00
Sergei Golubchik	07ec1a9e37	cleanup: unused function argument	2024-11-05 14:00:48 -08:00
Oleksandr Byelkin	c770bce898	Merge branch '11.2' into 11.4	2024-10-30 15:11:17 +01:00
Oleksandr Byelkin	69d033d165	Merge branch '10.11' into 11.2	2024-10-29 16:42:46 +01:00
Oleksandr Byelkin	3d0fb15028	Merge branch '10.6' into 10.11	2024-10-29 15:24:38 +01:00
Oleksandr Byelkin	f00711bba2	Merge branch '10.5' into 10.6	2024-10-29 14:20:03 +01:00
Yuchen Pei	4b6922a315	MDEV-25008: UPDATE/DELETE: Cost-based choice IN->EXISTS vs Materialization Single-table UPDATE/DELETE didn't provide outer_lookup_keys value for subqueries. This prevented a meaningful choice between IN->EXISTS and Materialization strategies for subqueries. Fix this: * Make UPDATE/DELETE save Sql_cmd_dml::scanned_rows, * Then, subquery's JOIN::choose_subquery_plan() can fetch it from there for outer_lookup_keys Details: UPDATE/DELETE now calls select_lex->optimize_unflattened_subqueries() twice, like SELECT does (first call optimize_constant_subqueries() in JOIN::optimize_inner(), then call optimize_unflattened_subqueries() in JOIN::optimize_stage2()): 1. Call with const_only=true before any optimizations. This allows range optimizer and others to use the values of cheap const subqueries. 2. Call it with const_only=false after range optimizer, partition pruning, etc. outer_lookup_keys value is provided, so it's possible to pick a good subquery strategy. Note: PROTECT_STATEMENT_MEMROOT requires that first SP execution performs subquery optimization for all subqueries, even for degenerate query plans like "Impossible WHERE". Due to that, we ensure that the call to optimize_unflattened_subqueries (with const_only=false) even for degenerate query plans still happens, as was the case before this change.	2024-10-23 23:51:24 +11:00
Sergei Golubchik	3a1cf2c85b	MDEV-34679 ER_BAD_FIELD uses non-localizable substrings	2024-10-17 21:37:37 +02:00
Monty	bddbef3573	MDEV-34533 asan error about stack overflow when writing record in Aria The problem was that when using clang + asan, we do not get a correct value for the thread stack as some local variables are not allocated at the normal stack. It looks like, for example, clang 18.1.3, when compiling with -O2 -fsanitize=address, puts local variables and things allocated by alloca() in other areas than on the stack. The following code shows the issue Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection (connect=0x5080000027b8, put_in_cache=<optimized out>) at sql/sql_connect.cc:1399 THD *thd; 1399 thd->thread_stack= (char*) &thd; (gdb) p &thd (THD **) 0x7fffedee7060 (gdb) p $sp (void *) 0x7fffef4e7bc0 The address of thd is 24M away from the stack pointer (gdb) info reg ... rsp 0x7fffef4e7bc0 0x7fffef4e7bc0 ... r13 0x7fffedee7060 140737185214560 r13 is pointing to the address of the thd. Probably some kind of "local stack" used by the sanitizer I have verified this with gdb on a recursive call that calls alloca() in a loop. In this case all objects were stored in a local heap, not on the stack. To solve this issue in a portable way, I have added two functions: my_get_stack_pointer() returns the address of the current stack pointer. The code is using asm instructions for intel 32/64 bit, powerpc, arm 32/64 bit and sparc 32/64 bit. Supported compilers are gcc, clang and MSVC. For MSVC 64 bit we are using _AddressOfReturnAddress() As a fallback for other compilers/arch we use the address of a local variable. my_get_stack_bounds() that will return the address of the base stack and stack size using pthread_attr_getstack() or NtCurrentTeb() with fallback to using the address of a local variable and user provided stack size. Server changes are: - Moving setting of thread_stack to THD::store_globals() using my_get_stack_bounds(). - Removing setting of thd->thread_stack, except in functions that allocate a lot on the stack before calling store_globals(). When using estimates for stack start, we reduce stack_size with MY_STACK_SAFE_MARGIN (8192) to take into account the stack used before calling store_globals(). I also added a unittest, stack_allocation-t, to verify the new code. Reviewed-by: Sergei Golubchik <serg@mariadb.org>	2024-10-16 17:24:46 +03:00
Rex	10008b3d3e	MDEV-31466 Add optional correlation column list for derived tables Extend derived table syntax to support column name assignment. (subquery expression) [as|=] ident [comma separated column name list]. Prior to this patch, the optional comma separated column name list is not supported. Processing within the unit of the subquery expression will use original column names, outside the unit will use the new names. For example, in the query select a1, a2 from (select c1, c2, c3 from t1 where c2 > 0) as dt (a1, a2, a3) where a2 > 10; we see the second column of the derived table dt being used both within, (where c2 > 0), and outside, (where a2 > 10), the specification. Both conditions apply to t1.c2. When multiple unit preparations are required, such as when being used within a prepared statement or procedure, original column names are needed for correct resolution. Original names are reset within mysql_derived_reinit(). Item_holder items, used for result tables in both TVC and union preparations are renamed before use within st_select_lex_unit::prepare(). During wildcard expansion, if column names are present, item names are set directly after creation. Reviewed by Igor Babaev (igor@mariadb.com)	2024-10-15 06:08:46 +12:00
Monty	6f6c1911dc	MDEV-34251 Conditional jump or move depends on uninitialised value in ha_handler_stats::has_stats Fixed by checking handler_stats if it's active instead of thd->variables.log_slow_verbosity & LOG_SLOW_VERBOSITY_ENGINE. Reviewed-by: Sergei Petrunia <sergey@mariadb.com>	2024-10-03 13:45:26 +03:00
Kristian Nielsen	db5d1cde45	MDEV-34857: Implement --slave-abort-blocking-timeout If a slave replicating an event has waited for more than @@slave_abort_blocking_timeout for a conflicting metadata lock held by a non-replication thread, the blocking query is killed to allow replication to proceed and not be blocked indefinitely by a user query. Reviewed-by: Monty <monty@mariadb.org> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-09-04 11:44:14 +02:00
Oleksandr Byelkin	342fa29615	Merge branch '11.4' into 11.5	2024-08-21 11:52:54 +02:00
Oleksandr Byelkin	eb70e0d6e2	Merge branch '11.2' into 11.4	2024-08-21 09:30:54 +02:00
Oleksandr Byelkin	6197e6abc4	Merge branch '10.11' into 11.2	2024-08-21 07:58:46 +02:00
Oleksandr Byelkin	70afc62750	Merge branch '10.6' into 10.11	2024-08-20 10:00:39 +02:00
Oleksandr Byelkin	fc5772ce17	Merge branch '10.5' into 10.6	2024-08-20 09:11:34 +02:00
Dmitry Shulga	ba5482ffc2	MDEV-34718: Trigger doesn't work correctly with bulk update Running an UPDATE statement in PS mode and having positional parameter(s) bound with an array of actual values (that is prepared to be run in bulk mode) results in incorrect behaviour in presence of on update trigger that also executes an UPDATE statement. The same is true for handling a DELETE statement in presence of on delete trigger. Typically, the visible effect of such incorrect behaviour is expressed in a wrong number of updated/deleted rows of a target table. Additionally, in the case of an UPDATE statement, a number of modified rows and a state message returned by a statement contains wrong information about a number of modified rows. The reason for incorrect number of updated/deleted rows is that a data structure used for binding positional argument with its actual values is stored in THD (this is thd->bulk_param) and reused on processing every INSERT/UPDATE/DELETE statement. It leads to consuming actual values bound with top-level UPDATE/DELETE statement by other DML statements used by triggers' body. To fix the issue, reset the thd->bulk_param temporarily to the value nullptr before invoking triggers and restore its value on finishing its execution. The second part of the problem relating with wrong value of affected rows reported by Connector/C API is caused by the fact that diagnostics area is reused by an original DML statement and a statement invoked by a trigger. This fact should be taken into account on finalizing a state of diagnostics area on completion running of a statement. Important remark: in case the macros DBUG_OFF is on, call of the method Diagnostics_area::reset_diagnostics_area() results in reset of the data members m_affected_rows, m_statement_warn_count. Values of these data members of the class Diagnostics_area are used on sending OK and EOF messages. In case DML statement is executed in PS bulk mode such resetting results in sending wrong result values to a client for affected rows in case the DML statement fires a trigger. So, reset these data members only in case the current statement being processed is not run in bulk mode.	2024-08-19 12:13:43 +07:00
Oleksandr Byelkin	ea75a0b600	Merge branch '11.4' into 11.5	2024-08-05 17:50:18 +02:00
Oleksandr Byelkin	1640c9b06e	Merge branch '11.2' into 11.4	2024-08-04 17:27:48 +02:00
Oleksandr Byelkin	dced6cbdb6	Merge branch '11.1' into 11.2	2024-08-03 09:50:16 +02:00
Oleksandr Byelkin	80abd847da	Merge branch '10.11' into 11.1	2024-08-03 09:32:42 +02:00
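
The MDEV-35680 entry above describes putting the MAX_TABLES check back inside the loop that assigns table numbers, plus a debug assertion where the bitmap bit is set. A minimal C++ sketch of that pattern might look as follows. All names here (`kMaxTables`, `TableRef`, `set_table_map`, `assign_table_maps`) and the limit value are assumptions for illustration, not the server's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; the real server types and limits differ.
using table_map = std::uint64_t;
constexpr unsigned kMaxTables = 61;  // assumed limit, must stay below 64 bits

struct TableRef {
  unsigned tablenr = 0;
  table_map map = 0;
};

// Sets the bitmap bit for one table. The assertion guards against
// shifting past the width of the 64-bit map.
inline void set_table_map(TableRef &t, unsigned tablenr) {
  assert(tablenr < 64);
  t.tablenr = tablenr;
  t.map = table_map{1} << tablenr;
}

// Numbers the tables in order. The limit is checked on every iteration,
// so no shift is attempted for a table number at or beyond kMaxTables.
// Returns true on error (too many tables), false on success.
inline bool assign_table_maps(std::vector<TableRef> &tables) {
  unsigned tablenr = 0;
  for (TableRef &t : tables) {
    if (tablenr >= kMaxTables) return true;  // check inside the loop
    set_table_map(t, tablenr);
    ++tablenr;
  }
  return false;
}
```

With the check placed after the loop instead, the shift would already have been evaluated for out-of-range table numbers before the error was reported, which is the kind of overflow the entry says UBSAN flagged.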


4394 commits
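
The MDEV-34718 and MDEV-34958 entries in the log above describe temporarily setting `thd->bulk_param` to `nullptr` while a trigger runs and restoring the original value afterwards. One way to sketch that save, reset, and restore pattern in C++ is a scope guard. `Session`, `BulkParamGuard`, and this `process_triggers` are hypothetical illustration names, and the server may well use plain assignments rather than a guard class.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-connection state (THD in the server).
struct Session {
  const void *bulk_param = nullptr;  // bound parameter array of the outer statement
};

// Clears bulk_param for the lifetime of the guard and restores the saved
// value when the guard goes out of scope, including on early returns.
class BulkParamGuard {
 public:
  explicit BulkParamGuard(Session &s) : session_(s), saved_(s.bulk_param) {
    session_.bulk_param = nullptr;
  }
  ~BulkParamGuard() { session_.bulk_param = saved_; }
  BulkParamGuard(const BulkParamGuard &) = delete;
  BulkParamGuard &operator=(const BulkParamGuard &) = delete;

 private:
  Session &session_;
  const void *saved_;
};

// Runs a trigger body with the outer statement's bulk parameters hidden.
// The body is any callable taking Session&.
template <typename Body>
void process_triggers(Session &s, Body body) {
  BulkParamGuard guard(s);
  assert(s.bulk_param == nullptr);  // statements in the trigger see no bulk data
  body(s);
}
```

Doing the reset in one place that every trigger invocation passes through mirrors the reasoning given in the MDEV-34958 entry for moving the logic into a single method.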