mariadb

mirror of https://github.com/MariaDB/server.git synced 2026-05-15 19:37:16 +02:00

Author	SHA1	Message	Date
Oleksandr Byelkin	1d0e94c55f	Merge branch '10.5' into 10.6	2024-10-09 08:38:48 +02:00
Thirunarayanan Balathandayuthapani	23820f1d79	MDEV-34392 Inplace algorithm violates the foreign key constraint - Fixing the compilation issue for the compiler lesser than gcc-6 Reviewed-by : Marko Mäkelä <marko.makela@mariadb.com>	2024-10-09 10:14:29 +05:30
Thirunarayanan Balathandayuthapani	65418ca9ad	MDEV-34392 Inplace algorithm violates the foreign key constraint - Fix the compilation error in gcc-5	2024-10-08 16:43:57 +05:30
Marko Mäkelä	7e0afb1c73	Merge 10.5 into 10.6	2024-10-03 09:31:39 +03:00
Sergei Golubchik	b1bbdbab9e	cleanup: remove redundant if() likely, a result of auto-merge of two fixes in different versions	2024-10-01 18:29:11 +02:00
Thirunarayanan Balathandayuthapani	cc810e64d4	MDEV-34392 Inplace algorithm violates the foreign key constraint Don't allow the referencing key column from NULL TO NOT NULL when 1) Foreign key constraint type is ON UPDATE SET NULL 2) Foreign key constraint type is ON DELETE SET NULL 3) Foreign key constraint type is UPDATE CASCADE and referenced column declared as NULL Don't allow the referenced key column from NOT NULL to NULL when foreign key constraint type is UPDATE CASCADE and referencing key columns doesn't allow NULL values get_foreign_key_info(): InnoDB sends the information about nullability of the foreign key fields and referenced key fields. fk_check_column_changes(): Enforce the above rules for COPY algorithm innobase_check_foreign_drop_col(): Checks whether the dropped column exists in existing foreign key relation innobase_check_foreign_low() : Enforce the above rules for INPLACE algorithm dict_foreign_t::check_fk_constraint_valid(): This is used by CREATE TABLE statement to check nullability for foreign key relation.	2024-10-01 09:41:56 +05:30
Max Kellermann	45298b730b	sql/handler: referenced_by_foreign_key() returns bool The method was declared to return an unsigned integer, but it is really a boolean (and used as such by all callers). A secondary change is the addition of "const" and "noexcept" to this method. In ha_mroonga.cpp, I also added "inline" to the two helper methods of referenced_by_foreign_key(). This allows the compiler to flatten the method.	2024-09-30 16:33:25 +03:00
Marko Mäkelä	b7b2d2bde4	Merge 10.5 into 10.6	2024-09-09 11:30:30 +03:00
Marko Mäkelä	024a18dbcb	MDEV-34823 Invalid arguments in ib_push_warning() In the bug report MDEV-32817 it occurred that the function row_mysql_get_table_status() is outputting a fil_space_t* as if it were a numeric tablespace identifier. ib_push_warning(): Remove. Let us invoke push_warning_printf() directly. innodb_decryption_failed(): Report a decryption failure and set the dict_table_t::file_unreadable flag. This code was being duplicated in very many places. We return the constant value DB_DECRYPTION_FAILED in order to avoid code duplication in the callers and to allow tail calls. innodb_fk_error(): Report a FOREIGN KEY error. dict_foreign_def_get(), dict_foreign_def_get_fields(): Remove. This code was being used in dict_create_add_foreign_to_dictionary() in an apparently uncovered code path. That ib_push_warning() call would pass the integer i+1 instead of a pointer to NUL terminated string ("%s"), and therefore the call should have resulted in a crash. dict_print_info_on_foreign_key_in_create_format(), innobase_quote_identifier(): Add const qualifiers. row_mysql_get_table_error(): Replaces row_mysql_get_table_status(). Display no message on DB_CORRUPTION; it should be properly reported at the SQL layer anyway.	2024-09-06 14:29:09 +03:00
Marko Mäkelä	bda40ccb85	MDEV-34803 innodb_lru_flush_size is no longer used In commit `fa8a46eb68` (MDEV-33613) the parameter innodb_lru_flush_size ceased to have any effect. Let us declare the parameter as deprecated and additionally as MARIADB_REMOVED_OPTION, so that there will be a warning written to the error log in case the option is specified in the command line. Let us also do the same for the parameter innodb_purge_rseg_truncate_frequency that was deprecated&ignored earlier in MDEV-32050. Reviewed by: Debarun Banerjee	2024-08-28 07:18:03 +03:00
Marko Mäkelä	b7b9f3ce82	MDEV-34515: Contention between purge and workload In a Sysbench oltp_update_index workload that involves 1 table, a serious contention between the workload and the purge of history was observed. This was the worst when the table contained only 1 record. This turned out to be fixed by setting innodb_purge_batch_size=128, which corresponds to the number of usable persistent rollback segments. When we go above that, there would be contention between row_purge_poss_sec() and the workload, typically on the clustered index page latch, sometimes also on a secondary index page latch. It might be that with smaller batches, trx_sys.history_size() will end up pausing all concurrent transaction start/commit frequently enough so that purge will be able to make some progress, so that there would be less contention on the index page latches between purge and SQL execution. In commit `aa719b5010` (part of MDEV-32050) the interpretation of the parameter innodb_purge_batch_size was slightly changed. It would correspond to the maximum desired size of the purge_sys.pages cache. Before that change, the parameter was referring to a number of undo log pages, but the accounting might have been inaccurate. To avoid a regression, we will reduce the default value to innodb_purge_batch_size=127, which will also be compatible with innodb_undo_tablespaces>1 (which will disable rollback segment 0). Additionally, some logic in the purge and MVCC checks is simplified. The purge tasks will make use of purge_sys.pages when accessing undo log pages to find out if a secondary index record can be removed. If an undo page needs to be looked up in buf_pool.page_hash, we will merely buffer-fix it. This is correct, because the undo pages are append-only in nature. Holding purge_sys.latch or purge_sys.end_latch or the fact that the current thread is executing as a part of an in-progress purge batch will prevent the contents of the undo page from being freed and subsequently reused. 
The buffer-fix will prevent the page from being evicted from the buffer pool. Thanks to this logic, we can refer to the undo log record directly in the buffer pool page and avoid copying the record. buf_pool_t::page_fix(): Look up and buffer-fix a page. This is useful for accessing undo log pages, which are append-only by nature. There will be no need to deal with change buffer or ROW_FORMAT=COMPRESSED in that case. purge_sys_t::view_guard::view_guard(): Allow the type of guard to be acquired: end_latch, latch, or no latch (in case we are a purge thread). purge_sys_t::view_guard::get(): Read-only accessor to purge_sys.pages. purge_sys_t::get_page(): Invoke buf_pool_t::page_fix(). row_vers_old_has_index_entry(): Replaced with row_purge_is_unsafe() and row_undo_mod_sec_unsafe(). trx_undo_get_undo_rec(): Merged to trx_undo_prev_version_build(). row_purge_poss_sec(): Add the parameter mtr and remove redundant or unused parameters sec_pcur, sec_mtr, is_tree. We will use the caller's mtr object but release any acquired page latches before returning. btr_cur_get_page(), page_cur_get_page(): Do not invoke page_align(). row_purge_remove_sec_if_poss_leaf(): Return the value of PAGE_MAX_TRX_ID to be checked against the page in row_purge_remove_sec_if_poss_tree(). If the secondary index page was not changed meanwhile, it will be unnecessary to invoke row_purge_poss_sec() again. trx_undo_prev_version_build(): Access any undo log pages using the caller's mini-transaction object. row_purge_vc_matches_cluster(): Moved to the only compilation unit that needs it. Reviewed by: Debarun Banerjee	2024-08-26 12:23:06 +03:00
Marko Mäkelä	757c368139	Merge 10.5 into 10.6	2024-08-14 10:56:11 +03:00
Jan Lindström	cd8b8bb964	MDEV-34594 : Assertion `client_state.transaction().active()' failed in int wsrep_thd_append_key(THD, const wsrep_key, int, Wsrep_service_key_type) CREATE TABLE [SELECT|REPLACE SELECT] is CTAS and idea was that we force ROW format. However, it was not correctly enforced and keys were appended before wsrep transaction was started. At THD::decide_logging_format we should force used stmt binlog format to ROW in CTAS case and produce a warning if used binlog format was not ROW. At ha_innodb::update_row we should not append keys similarly as in ha_innodb::write_row if sql_command is SQLCOM_CREATE_TABLE. Improved error logging on ::write_row, ::update_row and ::delete_row if wsrep key append fails. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-08-12 23:54:30 +02:00
Oleksandr Byelkin	8f020508c8	Merge branch '10.5' into 10.6	2024-08-03 09:04:24 +02:00
Thirunarayanan Balathandayuthapani	533e6d5d13	MDEV-34670 IMPORT TABLESPACE unnecessary traverses tablespace list Problem: ======== - After the commit `ada1074bb1` (MDEV-14398) fil_crypt_set_encrypt_tables() iterates through all tablespaces to fill the default_encrypt tables list. This was a trigger to encrypt or decrypt when key rotation age is set to 0. But import tablespace does call fil_crypt_set_encrypt_tables() unnecessarily. The motivation for the call is to signal the encryption threads. Fix: ==== ha_innobase::discard_or_import_tablespace: Remove the fil_crypt_set_encrypt_tables() and add the import tablespace to the default encrypt list if necessary	2024-07-31 14:13:38 +05:30
Monty	4bf7c966b3	MDEV-34664: Add an option to fix InnoDB's doubling of secondary index cardinalities (With trivial fixes by sergey@mariadb.com) Added option fix_innodb_cardinality to optimizer_adjust_secondary_key_costs Using fix_innodb_cardinality disables the 'divide by 2' of rec_per_key_int in InnoDB that in effect doubles the Cardinality for secondary keys. This has the biggest effect for indexes where a few rows have the same key value. Using this may also cause table scans for very small tables (which in some cases may be better than an index scan). The user visible effect is that 'SHOW INDEX FROM table_name' will for InnoDB show the true Cardinality (and not 2x the real value). It will also allow the optimizer to choose a better index in some cases as the division by 2 could have a bad effect for tables with 2-5 identical values per key. A few notes about using fix_innodb_cardinality: - It has a direct effect on SHOW INDEX FROM table_name. SHOW INDEX will also update the statistics in table share. - The effect of fix_innodb_cardinality for query plans or EXPLAIN is only visible after first open of the table. This is why one must do a flush tables or use SHOW INDEX for the option to take effect. - Using fix_innodb_cardinality can thus affect all users in their query plans if they are using the same tables. Because of this, it is strongly recommended that one uses optimizer_adjust_secondary_key_costs=fix_innodb_cardinality mainly in configuration files to not cause issues for other users.	2024-07-29 16:40:53 +03:00
Alexander Barkov	e56040fee8	Merge remote-tracking branch 'origin/10.5' into 10.6	2024-07-08 18:59:04 +04:00
Thirunarayanan Balathandayuthapani	834c013b64	MDEV-34519 innodb_log_checkpoint_now crashes when innodb_read_only is enabled During read only mode, InnoDB doesn't allow checkpoint to happen. So InnoDB should throw the warning when InnoDB tries to force the checkpoint when innodb_read_only = 1 or innodb_force_recovery = 6.	2024-07-05 15:26:05 +05:30
Denis Protivensky	cfbd57dfb7	MDEV-33064: Sync trx->wsrep state from THD on trx start InnoDB transactions may be reused after committed: - when taken from the transaction pool - during a DDL operation execution In this case wsrep flag on trx object is cleared, which may cause wrong execution logic afterwards (wsrep-related hooks are not run). Make trx->wsrep flag initialize from THD object only once on InnoDB transaction start and don't change it throughout the transaction's lifetime. The flag is reset at commit time as before. Unconditionally set wsrep=OFF for THD objects that represent InnoDB background threads. Make Wsrep_schema::store_view() operate in its own transaction. Fix streaming replication transactions' fragments rollback to not switch THD->wsrep value during transaction's execution (use THD->wsrep_ignore_table as a workaround). Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-07-01 13:07:39 +02:00
Marko Mäkelä	d1ecf5cc5f	MDEV-32176 Contention in ha_innobase::info_low() During a Sysbench oltp_point_select workload with 1 table and 400 concurrent connections, a bottleneck on dict_table_t::lock_mutex was observed in ha_innobase::info_low(). dict_table_t::lock_latch: Replaces lock_mutex. In ha_innobase::info_low() and several other places, we will acquire a shared dict_table_t::lock_latch or we may elide the latch if hardware memory transactions are available. innobase_build_v_templ(): Remove the parameter "bool locked", and require the caller to hold exclusive dict_table_t::lock_latch (instead of holding an exclusive dict_sys.latch). Tested by: Vladislav Vaintroub Reviewed by: Vladislav Vaintroub	2024-06-28 15:57:07 +03:00
Marko Mäkelä	0076eb3d4e	Merge 10.5 into 10.6	2024-06-24 13:09:47 +03:00
Marko Mäkelä	acc077ffa1	MDEV-34443 ha_innobase::info_low() does not distinguish HA_STATUS_VARIABLE_EXTRA ha_innobase::info_low(): For HA_STATUS_VARIABLE without HA_STATUS_VARIABLE_EXTRA, let us avoid unnecessary and costly updates of the data_free statistics, which are only needed for SHOW TABLE STATUS. This optimization had been enabled in commit `247ecb7597` but not utilized until now.	2024-06-24 10:39:13 +03:00
Jan Lindström	ee974ca5e0	MDEV-31658 : Deadlock found when trying to get lock during applying Problem was that there was two non-conflicting local idle transactions in node_1 that both inserted a key to primary key. Then two transactions from other nodes inserted also a key to primary key so that insert from node_2 conflicted one of the local transactions in node_1 so that there would be duplicate key if both are committed. For this insert from other node tries to acquire S-lock for this record and because this insert is high priority brute force (BF) transaction it will kill idle local transaction. Concurrently, second insert from node_3 conflicts the second idle insert transaction in node_1. Again, it tries to acquire S-lock for this record and kills idle local transaction. At this point we have two non-conflicting high priority transactions holding S-lock on different records in node_1. For example like this: rec s-lock-node2-rec s-lock-node3-rec rec. Because these high priority BF-transactions do not wait each other insert from node3 that has later seqno compared to insert from node2 can continue. It will try to acquire insert intention for record it tries to insert (to avoid duplicate key to be inserted by local transaction). However, it will note that there is conflicting S-lock in same gap between records. This will lead deadlock error as we have defined that BF-transactions may not wait for record lock but we can't kill conflicting BF-transaction because it has lower seqno and it should commit first. BF-transactions are executed concurrently because their values to primary key are different i.e. they do not conflict. Galera certification will make sure that inserts from other nodes i.e these high priority BF-transactions can't insert duplicate keys. Local transactions naturally can but they will be killed when BF-transaction acquires required record locks. 
Therefore, we can allow situation where there is conflicting S-lock and insert intention lock regardless of their seqno order and let both continue with no wait. This will lead to situation where we need to allow BF-transaction to wait when lock_rec_has_to_wait_in_queue is called because this function is also called from lock_rec_queue_validate and because lock is waiting there would be assertion in ut_a(lock->is_gap() || lock_rec_has_to_wait_in_queue(cell, lock)); lock_wait_wsrep_kill Add debug sync points for BF-transactions killing local transaction. wsrep_assert_no_bf_bf_wait Print also requested lock information lock_rec_has_to_wait Add function to handle wsrep transaction lock wait cases. lock_rec_has_to_wait_wsrep New function to handle wsrep transaction lock wait exceptions. lock_rec_has_to_wait_in_queue Remove wsrep exception, in this function all conflicting locks need to wait in queue. Conflicts between BF and local transactions are handled in lock_wait. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-06-19 14:09:11 +02:00
Marko Mäkelä	27834ebc91	Merge 10.5 into 10.6	2024-06-10 15:22:15 +03:00
Thirunarayanan Balathandayuthapani	b7a75fbb8a	MDEV-34169 Don't allow innodb_open_files to be lesser than number of non-user tablespace. - InnoDB only closes the user tablespace when the number of open files exceeds innodb_open_files limit. In that case, InnoDB should make sure that innodb_open_files value should be greater than number of undo tablespace, system and temporary tablespace files.	2024-06-07 15:37:11 +05:30
Marko Mäkelä	699d38d951	MDEV-34296 extern thread_local is a CPU waste In commit 99bd22605938c42d876194f2ec75b32e658f00f5 (MDEV-31558) we wrongly thought that there would be minimal overhead for accessing a thread-local variable mariadb_stats. It turns out that in C++11, each access to an extern thread_local variable requires conditionally invoking an initialization function. In fact, the initializer expression of mariadb_stats is dynamic, and those calls were actually unavoidable. In C++20, one could declare constinit thread_local variables, but the address of a thread_local variable (&mariadb_dummy_stats) is not a compile-time constant. We did not want to declare mariadb_dummy_stats without thread_local, because then the dummy accesses could lead to cache line contention between threads. mariadb_stats: Declare as __thread or __declspec(thread) so that there will be no dynamic initialization, but zero-initialization. mariadb_dummy_stats: Remove. It is a lesser evil to let the environment perform zero-initialization and check if !mariadb_stats. Reviewed by: Sergei Petrunia	2024-06-06 14:38:42 +03:00
Thirunarayanan Balathandayuthapani	58a0e1e3dd	MDEV-34223 Innodb - add status variable for number of bulk inserts - Added a counter innodb_num_bulk_insert_operation in INFORMATION_SCHEMA.GLOBAL_STATUS. This counter is incremented whenever a InnoDB undergoes bulk insert operation. - Change the innodb_instant_alter_column to atomic variable.	2024-06-03 16:27:22 +05:30
Vladislav Vaintroub	736449d30f	MDEV-34205: ASAN stack buffer overflow in strxnmov() in frm_file_exists Correct the second parameter for strxnmov to prevent potential buffer overflows. The second parameter must be one less than the size of the input buffer to avoid writing past the end of the buffer. While the second parameter is usually correct, there are exceptions that need fixing. This commit addresses the issue within frm_file_exists() and other affected places.	2024-05-23 22:08:27 +02:00
Thirunarayanan Balathandayuthapani	8c8b7da017	MDEV-33979 Disallow bulk insert operation during partition update statement Problem: ======== - Partition update operation enables the bulk insert for the transaction while moving the row between partitions. This leads to debug assert failure while removing the row from one of the partition. Solution: ======== - Disallow the bulk insert operation for non-insert operation of partition table.	2024-04-25 10:50:34 +05:30
Marko Mäkelä	e459ce8336	MDEV-33779 InnoDB row operations could be faster We have quite a few assertions ut_a(m_prebuilt->trx == thd_to_trx(ha_thd())); in low-level functions. These had better be debug assertions for performance reasons. It should suffice to check that condition in the less frequently invoked ha_innobase::change_active_index(). convert_search_mode_to_innobase(): Return whether the mode is unsupported, and optionally update ha_innobase::m_last_match_mode. ha_innobase::index_read(): Only branch on find_flag once, and simplify the error handling after invoking row_search_mvcc(). ha_innobase::rnd_pos(): Remove an assertion that is duplicating one in ha_innobase::index_read(), which we are calling unconditionally. ha_innobase::records_in_range(): Check only once whether min_key, max_key are null pointers. row_sel_convert_mysql_key_to_innobase(): Declare all parameters except the conversion buffer pointer (buf) to be nonnull. Reviewed by: Debarun Banerjee	2024-04-17 16:47:41 +03:00
Marko Mäkelä	829cb1a49c	Merge 10.5 into 10.6	2024-04-17 14:14:58 +03:00
Kristian Nielsen	16aa4b5f59	Merge from 10.4 to 10.5 Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-04-15 17:46:49 +02:00
Sergei Golubchik	41296a07c8	Merge branch '10.5' into 10.6	2024-04-11 13:58:22 +02:00
sjaakola	2fcf2ec229	MDEV-33749 hyphen in table name can cause galera certification failures Fix in this commit handles foreign key value appending into write set so that db and table names are converted from the filepath format to tablename format. This is compatible with key values appended from elsewhere in the code base There is a mtr test galera.galera_table_with_hyphen for regression testing Reviewer: monty@mariadb.com	2024-04-04 17:12:09 +03:00
Marko Mäkelä	b8a6719889	MDEV-26642/MDEV-26643/MDEV-32898 Implement innodb_snapshot_isolation https://jepsen.io/analyses/mysql-8.0.34 highlights that the transaction isolation levels in the InnoDB storage engine do not correspond to any widely accepted definitions, such as "Generalized Isolation Level Definitions" https://pmg.csail.mit.edu/papers/icde00.pdf (PL-1 = READ UNCOMMITTED, PL-2 = READ COMMITTED, PL-2.99 = REPEATABLE READ, PL-3 = SERIALIZABLE). Only READ UNCOMMITTED in InnoDB seems to match the above definition. The issue is that InnoDB does not detect write/write conflicts (Section 4.4.3, Definition 6) in the above. It appears that as soon as we implement write/write conflict detection (SET SESSION innodb_snapshot_isolation=ON), the default isolation level (SET TRANSACTION ISOLATION LEVEL REPEATABLE READ) will become Snapshot Isolation (similar to Postgres), as defined in Section 4.2 of "A Critique of ANSI SQL Isolation Levels", MSR-TR-95-51, June 1995 https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-95-51.pdf Locking reads inside InnoDB used to read the latest committed version, ignoring what should actually be visible to the transaction. The added test innodb.lock_isolation illustrates this. The statement UPDATE t SET a=3 WHERE b=2; is executed in a transaction that was started before a read view or a snapshot of the current transaction was created, and committed before the current transaction attempts to execute UPDATE t SET b=3; If SET innodb_snapshot_isolation=ON is in effect when the second transaction was started, the second transaction will be aborted with the error ER_CHECKREAD. By default (innodb_snapshot_isolation=OFF), the second transaction would execute inconsistently, displaying an incorrect SELECT COUNT(*) FROM t in its read view. 
If innodb_snapshot_isolation=ON, if an attempt to acquire a lock on a record that does not exist in the current read view is made, an error DB_RECORD_CHANGED (HA_ERR_RECORD_CHANGED, ER_CHECKREAD) will be raised. This error will be treated in the same way as a deadlock: the transaction will be rolled back. lock_clust_rec_read_check_and_lock(): If the current transaction has a read view where the record is not visible and innodb_snapshot_isolation=ON, fail before trying to acquire the lock. row_sel_build_committed_vers_for_mysql(): If innodb_snapshot_isolation=ON, disable the "semi-consistent read" logic that had been implemented by myself on the directions of Heikki Tuuri in order to address https://bugs.mysql.com/bug.php?id=3300 that was motivated by a customer wanting UPDATE to skip locked rows that do not match the WHERE condition. It looks like my changes were included in the MySQL 5.1.5 commit `ad126d90e0`; at that time, employees of Innobase Oy (a recent acquisition of Oracle) had lost write access to the repository. The only reason why we set innodb_snapshot_isolation=OFF by default is backward compatibility with applications, such as the one that motivated the implementation of "semi-consistent read" back in 2005. In a later major release, we can default to innodb_snapshot_isolation=ON. Thanks to Peter Alvaro, Kyle Kingsbury and Alexey Gotsman for their work on https://github.com/jepsen-io/ and to Kyle and Alexey for explanations and some testing of this fix. Thanks to Vladislav Lesin for the initial test for MDEV-26643, as well as reviewing these changes.	2024-03-20 09:48:03 +02:00
Marko Mäkelä	50715bd2ed	Merge 10.5 into 10.6	2024-03-18 17:07:32 +02:00
Marko Mäkelä	09d991d01c	MDEV-33478: Tests massively fail with clang-18 -fsanitize=memory Starting with clang-16, MemorySanitizer appears to check that uninitialized values not be passed by value nor returned. Previously, it was allowed to copy uninitialized data in such cases. get_foreign_key_info(): Remove a local variable that was passed uninitialized to a function. DsMrr_impl: Initialize key_buffer, because DsMrr_impl::dsmrr_init() is reading it. test_bind_result_ext1(): MYSQL_TYPE_LONG is 32 bits, hence we must use a 32-bit type, such as int. sizeof(long) differs between LP64 and LLP64 targets.	2024-03-18 16:01:29 +02:00
mariadb-DebarunBanerjee	d912a6369c	MDEV-31154 Fatal InnoDB error or assertion `!is_v' failure upon multi-update with indexed virtual column MDEV-33558 Fatal error InnoDB: Clustered record field for column x not found This issue is about row ID filtering used with index on virtual column(s). We hit debug assert and crash while building the record template in InnoDB. The primary reason is that we try to force the code path to use the ICP path. With ICP, we don't support index with virtual column and we validate it while index condition is pushed. Simplify the code for building template to handle both ICP and Row ID filtering by skipping virtual columns.	2024-03-15 19:29:46 +05:30
Marko Mäkelä	c3a00dfa53	Merge 10.5 into 10.6	2024-03-12 09:19:57 +02:00
mariadb-DebarunBanerjee	afe9632913	MDEV-33593 Auto increment deadlock error causes ASSERT in subsequent save point The issue here is ha_innobase::get_auto_increment() could cause a deadlock involving auto-increment lock and rollback the transaction implicitly. For such cases, storage engines usually call thd_mark_transaction_to_rollback() to inform SQL engine about it which in turn takes appropriate actions and close the transaction. In innodb, we call it while converting Innodb error code to MySQL. However, since ::innobase_get_autoinc() returns void, we skip the call for error code conversion and also miss marking the transaction for rollback for deadlock error. We assert eventually while releasing a savepoint as the transaction state is not active. Since convert_error_code_to_mysql() is handling some generic error handling part, like invoking the callback when needed, we should call that function in ha_innobase::get_auto_increment() even if we don't return the resulting mysql error code back.	2024-03-07 21:54:06 +05:30
Marko Mäkelä	8ec12e0d6d	Merge 10.4 into 10.5	2024-02-12 11:38:13 +02:00
Marko Mäkelä	466069b184	Merge 10.5 into 10.6	2024-02-08 10:38:53 +02:00
Marko Mäkelä	0381921e26	MDEV-33277 In-place upgrade causes invalid AUTO_INCREMENT values MDEV-33308 CHECK TABLE is modifying .frm file even if --read-only As noted in commit `d0ef1aaf61`, MySQL as well as older versions of MariaDB server would during ALTER TABLE ... IMPORT TABLESPACE write bogus values to the PAGE_MAX_TRX_ID field to pages of the clustered index, instead of letting that field remain 0. In commit `8777458a6e` this field was repurposed for PAGE_ROOT_AUTO_INC in the clustered index root page. To avoid trouble when upgrading from MySQL or older versions of MariaDB, we will try to detect and correct bogus values of PAGE_ROOT_AUTO_INC when opening a table for the first time from the SQL layer. btr_read_autoinc_with_fallback(): Add the parameters to mysql_version,max to indicate the TABLE_SHARE::mysql_version of the .frm file and the maximum value allowed for the type of the AUTO_INCREMENT column. In case the table was originally created in MySQL or an older version of MariaDB, read also the maximum value of the AUTO_INCREMENT column from the table and reset the PAGE_ROOT_AUTO_INC if it is above the limit. dict_table_t::get_index(const dict_col_t &) const: Find an index that starts with the specified column. ha_innobase::check_for_upgrade(): Return HA_ADMIN_FAILED if InnoDB needs upgrading but is in read-only mode. In this way, the call to update_frm_version() will be skipped. row_import_autoinc(): Adjust the AUTO_INCREMENT column at the end of ALTER TABLE...IMPORT TABLESPACE. This refinement was suggested by Debarun Banerjee. The changes outside InnoDB were developed by Michael 'Monty' Widenius: Added print_check_msg() service for easy reporting of check/repair messages in ENGINE=Aria and ENGINE=InnoDB. Fixed that CHECK TABLE do not update the .frm file under --read-only. Added 'handler_flags' to HA_CHECK_OPT as a way for storage engines to store state from handler::check_for_upgrade(). Reviewed by: Debarun Banerjee	2024-02-08 10:35:45 +02:00
Marko Mäkelä	b2654ba826	MDEV-32899 InnoDB is holding shared dict_sys.latch while waiting for FOREIGN KEY child table lock on DDL lock_table_children(): A new function to lock all child tables of a table. We will only hold dict_sys.latch while traversing dict_table_t::referenced_set. To prevent a race condition with std::set::erase() we will copy the pointers to the child tables to a local vector. Once we have acquired MDL and references to all child tables, we can safely release dict_sys.latch, wait for the locks, and finally release the references. dict_acquire_mdl_shared(): A new variant that takes mdl_context as a parameter. lock_table_for_trx(): Assert that we are not holding dict_sys.latch. ha_innobase::truncate(): When foreign_key_checks=ON, assert that no child tables exist (other than the current table). In any case, we will invoke lock_table_children() so that the child table metadata can be safely updated. (It is possible that a child table is being created concurrently with TRUNCATE TABLE.) ha_innobase::delete_table(): Before and after acquiring exclusive locks on the current table as well as all child tables, check that FOREIGN KEY constraints will not be violated. In this way, we can reject impossible DROP TABLE without having to wait for locks first. This fixes up commit `2ca1123464` (MDEV-26217) and commit `c3c53926c4` (MDEV-26554).	2024-02-08 14:22:35 +11:00
Marko Mäkelä	8d54d173d7	Cleanup: Remove ut_format_name() This follows up commit `383f77cd84` which simplified dict_table_schema_check(). Note: We can display quoted names like this: my_snprintf(buf, sizeof buf, "%`.*s.%`s", int(t->name.dblen()), t->name.m_name, t->name.basename());	2024-02-07 13:56:31 +02:00
Marko Mäkelä	91a2192bf2	Merge 10.5 into 10.6	2024-02-07 13:51:03 +02:00
Thirunarayanan Balathandayuthapani	21f18bd9d7	MDEV-33341 innodb.undo_space_dblwr test case fails with Unknown Storage Engine InnoDB Reason: ====== undo_space_dblwr test case fails if the first page of undo tablespace is not flushed before restart the server. While restarting the server, InnoDB fails to detect the first page of undo tablespace from doublewrite buffer. Fix: === Use "ib_log_checkpoint_avoid_hard" debug sync point to avoid checkpoint and make sure to flush the dirtied page before killing the server. innodb_make_page_dirty(): Fails to set srv_fil_make_page_dirty_debug variable.	2024-01-31 15:55:09 +05:30
Marko Mäkelä	b7d1f65b81	MDEV-12266 fixup: Remove dead code Ever since commit `5e84ea9634` this "else if" branch was unreachable because the preceding "if" condition covered it.	2024-01-30 13:10:53 +02:00
Marko Mäkelä	21560bee9d	Revert "MDEV-32899 InnoDB is holding shared dict_sys.latch while waiting for FOREIGN KEY child table lock on DDL" This reverts commit `569da6a7ba`, commit `768a736174`, and commit `ba6bf7ad9e` because of a regression that was filed as MDEV-33104.	2024-01-19 12:46:11 +02:00
Marko Mäkelä	a6290a5bc5	MDEV-33095 innodb_flush_method=O_DIRECT creates excessive errors on Solaris The directio(3C) function on Solaris is supported on NFS and UFS while the majority of users should be on ZFS, which is a copy-on-write file system that implements transparent compression and therefore cannot support unbuffered I/O. Let us remove the call to directio() and simply treat innodb_flush_method=O_DIRECT in the same way as the previous default value innodb_flush_method=fsync on Solaris. Also, let us remove some dead code around calls to os_file_set_nocache() on platforms where fcntl(2) is not usable with O_DIRECT. On IBM AIX, O_DIRECT is not documented for fcntl(2), only for open(2).	2024-01-19 15:34:33 +11:00


3,061 commits