mariadb

mirror of https://github.com/MariaDB/server.git synced 2026-05-16 11:57:38 +02:00

Author	SHA1	Message	Date
Marko Mäkelä	63913ce5af	Merge 10.6 into 10.11	2024-10-03 10:55:08 +03:00
Marko Mäkelä	7e0afb1c73	Merge 10.5 into 10.6	2024-10-03 09:31:39 +03:00
Sergei Golubchik	b1bbdbab9e	cleanup: remove redundant if() likely, a result of auto-merge of two fixes in different versions	2024-10-01 18:29:11 +02:00
Thirunarayanan Balathandayuthapani	cc810e64d4	MDEV-34392 Inplace algorithm violates the foreign key constraint Don't allow the referencing key column from NULL TO NOT NULL when 1) Foreign key constraint type is ON UPDATE SET NULL 2) Foreign key constraint type is ON DELETE SET NULL 3) Foreign key constraint type is UPDATE CASCADE and referenced column declared as NULL Don't allow the referenced key column from NOT NULL to NULL when foreign key constraint type is UPDATE CASCADE and referencing key columns doesn't allow NULL values get_foreign_key_info(): InnoDB sends the information about nullability of the foreign key fields and referenced key fields. fk_check_column_changes(): Enforce the above rules for COPY algorithm innobase_check_foreign_drop_col(): Checks whether the dropped column exists in existing foreign key relation innobase_check_foreign_low() : Enforce the above rules for INPLACE algorithm dict_foreign_t::check_fk_constraint_valid(): This is used by CREATE TABLE statement to check nullability for foreign key relation.	2024-10-01 09:41:56 +05:30
Max Kellermann	45298b730b	sql/handler: referenced_by_foreign_key() returns bool The method was declared to return an unsigned integer, but it is really a boolean (and used as such by all callers). A secondary change is the addition of "const" and "noexcept" to this method. In ha_mroonga.cpp, I also added "inline" to the two helper methods of referenced_by_foreign_key(). This allows the compiler to flatten the method.	2024-09-30 16:33:25 +03:00
Marko Mäkelä	6acada713a	MDEV-34062: Implement innodb_log_file_mmap on 64-bit systems When using the default innodb_log_buffer_size=2m, mariadb-backup --backup would spend a lot of time re-reading and re-parsing the log. For reads, it would be beneficial to memory-map the entire ib_logfile0 to the address space (typically 48 bits or 256 TiB) and read it from there, both during --backup and --prepare. We will introduce the Boolean read-only parameter innodb_log_file_mmap that will be OFF by default on most platforms, to avoid aggressive read-ahead of the entire ib_logfile0 when only a tiny portion would be accessed. On Linux and FreeBSD the default is innodb_log_file_mmap=ON, because those platforms define a specific mmap(2) option for enabling such read-ahead and therefore it can be assumed that the default would be on-demand paging. This parameter will only have impact on the initial InnoDB startup and recovery. Any writes to the log will use regular I/O, except when the ib_logfile0 is stored in a specially configured file system that is backed by persistent memory (Linux "mount -o dax"). We also experimented with allowing writes of the ib_logfile0 via a memory mapping and decided against it. A fundamental problem would be unnecessary read-before-write in case of a major page fault, that is, when a new, not yet cached, virtual memory page in the circular ib_logfile0 is being written to. There appears to be no way to tell the operating system that we do not care about the previous contents of the page, or that the page fault handler should just zero it out. Many references to HAVE_PMEM have been replaced with references to HAVE_INNODB_MMAP. The predicate log_sys.is_pmem() has been replaced with log_sys.is_mmap() && !log_sys.is_opened(). Memory-mapped regular files differ from MAP_SYNC (PMEM) mappings in the way that an open file handle to ib_logfile0 will be retained. In both code paths, log_sys.is_mmap() will hold.
Holding a file handle open will allow log_t::clear_mmap() to disable the interface with fewer operations. It should be noted that ever since commit `685d958e38` (MDEV-14425) most 64-bit Linux platforms on our CI platforms (s390x a.k.a. IBM System Z being a notable exception) read and write /dev/shm/*/ib_logfile0 via a memory mapping, pretending that it is persistent memory (mount -o dax). So, the memory mapping based log parsing that this change is enabling by default on Linux and FreeBSD has already been extensively tested on Linux. ::log_mmap(): If a log cannot be opened as PMEM and the desired access is read-only, try to open a read-only memory mapping. xtrabackup_copy_mmap_snippet(), xtrabackup_copy_mmap_logfile(): Copy the InnoDB log in mariadb-backup --backup from a memory mapped file.	2024-09-26 18:47:12 +03:00
Yuchen Pei	b168859d1e	Merge branch '10.6' into 10.11	2024-09-11 16:10:53 +10:00
Marko Mäkelä	b7b2d2bde4	Merge 10.5 into 10.6	2024-09-09 11:30:30 +03:00
Marko Mäkelä	024a18dbcb	MDEV-34823 Invalid arguments in ib_push_warning() In the bug report MDEV-32817 it occurred that the function row_mysql_get_table_status() is outputting a fil_space_t* as if it were a numeric tablespace identifier. ib_push_warning(): Remove. Let us invoke push_warning_printf() directly. innodb_decryption_failed(): Report a decryption failure and set the dict_table_t::file_unreadable flag. This code was being duplicated in very many places. We return the constant value DB_DECRYPTION_FAILED in order to avoid code duplication in the callers and to allow tail calls. innodb_fk_error(): Report a FOREIGN KEY error. dict_foreign_def_get(), dict_foreign_def_get_fields(): Remove. This code was being used in dict_create_add_foreign_to_dictionary() in an apparently uncovered code path. That ib_push_warning() call would pass the integer i+1 instead of a pointer to NUL terminated string ("%s"), and therefore the call should have resulted in a crash. dict_print_info_on_foreign_key_in_create_format(), innobase_quote_identifier(): Add const qualifiers. row_mysql_get_table_error(): Replaces row_mysql_get_table_status(). Display no message on DB_CORRUPTION; it should be properly reported at the SQL layer anyway.	2024-09-06 14:29:09 +03:00
Marko Mäkelä	984606d747	MDEV-34750 SET GLOBAL innodb_log_file_size is not crash safe The recent commit `4ca355d863` (MDEV-33894) caused a serious regression for online InnoDB ib_logfile0 resizing, breaking crash-safety unless the memory-mapped log file interface is being used. However, the log resizing was broken also before this. To prevent such regressions in the future, we extend the test innodb.log_file_size_online with a kill and restart of the server and with some writes running concurrently with the log size change. When run enough many times, this test revealed all the bugs that are being fixed by the code changes. log_t::resize_start(): Do not allow the resized log to start before the current log sequence number. In this way, there is no need to copy anything to the first block of resize_buf. The previous logic regarding that was incorrect in two ways. First, we would have to copy from the last written buffer (buf or flush_buf). Second, we failed to ensure that the mini-transaction end marker bytes would be 1 in the buffer. If the source ib_logfile0 had wrapped around an odd number of times, the end marker would be 0. This was occasionally observed when running the test innodb.log_file_size_online. log_t::resize_write_buf(): To adjust for the resize_start() change, do not write anything that would be before the resize_lsn. Take the buffer (resize_buf or resize_flush_buf) as a parameter. Starting with commit `4ca355d863` we no longer swap buffers when rewriting the last log block. log_t::append(): Define as a static function; only some debug assertions need to refer to the log_sys object. innodb_log_file_size_update(): Wake up the buf_flush_page_cleaner() if needed, and wait for it to complete a batch while waiting for the log resizing to be completed. If the current LSN is behind the resize target LSN, we will write redundant FILE_CHECKPOINT records to ensure that the log resizing completes. 
If the buf_pool.flush_list is empty or the buf_flush_page_cleaner() is stuck for some reason, our wait will time out in 5 seconds, so that we can periodically check if the execution of SET GLOBAL innodb_log_file_size was aborted. Previously, we could get into a busy loop here while the buf_flush_page_cleaner() would remain idle.	2024-08-29 14:53:08 +03:00
Marko Mäkelä	cfcf27c6fe	Merge 10.6 into 10.11	2024-08-29 07:47:29 +03:00
Marko Mäkelä	bda40ccb85	MDEV-34803 innodb_lru_flush_size is no longer used In commit `fa8a46eb68` (MDEV-33613) the parameter innodb_lru_flush_size ceased to have any effect. Let us declare the parameter as deprecated and additionally as MARIADB_REMOVED_OPTION, so that there will be a warning written to the error log in case the option is specified in the command line. Let us also do the same for the parameter innodb_purge_rseg_truncate_frequency that was deprecated&ignored earlier in MDEV-32050. Reviewed by: Debarun Banerjee	2024-08-28 07:18:03 +03:00
Marko Mäkelä	b7b9f3ce82	MDEV-34515: Contention between purge and workload In a Sysbench oltp_update_index workload that involves 1 table, a serious contention between the workload and the purge of history was observed. This was the worst when the table contained only 1 record. This turned out to be fixed by setting innodb_purge_batch_size=128, which corresponds to the number of usable persistent rollback segments. When we go above that, there would be contention between row_purge_poss_sec() and the workload, typically on the clustered index page latch, sometimes also on a secondary index page latch. It might be that with smaller batches, trx_sys.history_size() will end up pausing all concurrent transaction start/commit frequently enough so that purge will be able to make some progress, so that there would be less contention on the index page latches between purge and SQL execution. In commit `aa719b5010` (part of MDEV-32050) the interpretation of the parameter innodb_purge_batch_size was slightly changed. It would correspond to the maximum desired size of the purge_sys.pages cache. Before that change, the parameter was referring to a number of undo log pages, but the accounting might have been inaccurate. To avoid a regression, we will reduce the default value to innodb_purge_batch_size=127, which will also be compatible with innodb_undo_tablespaces>1 (which will disable rollback segment 0). Additionally, some logic in the purge and MVCC checks is simplified. The purge tasks will make use of purge_sys.pages when accessing undo log pages to find out if a secondary index record can be removed. If an undo page needs to be looked up in buf_pool.page_hash, we will merely buffer-fix it. This is correct, because the undo pages are append-only in nature. Holding purge_sys.latch or purge_sys.end_latch or the fact that the current thread is executing as a part of an in-progress purge batch will prevent the contents of the undo page from being freed and subsequently reused. 
The buffer-fix will prevent the page from being evicted from the buffer pool. Thanks to this logic, we can refer to the undo log record directly in the buffer pool page and avoid copying the record. buf_pool_t::page_fix(): Look up and buffer-fix a page. This is useful for accessing undo log pages, which are append-only by nature. There will be no need to deal with change buffer or ROW_FORMAT=COMPRESSED in that case. purge_sys_t::view_guard::view_guard(): Allow the type of guard to be acquired: end_latch, latch, or no latch (in case we are a purge thread). purge_sys_t::view_guard::get(): Read-only accessor to purge_sys.pages. purge_sys_t::get_page(): Invoke buf_pool_t::page_fix(). row_vers_old_has_index_entry(): Replaced with row_purge_is_unsafe() and row_undo_mod_sec_unsafe(). trx_undo_get_undo_rec(): Merged to trx_undo_prev_version_build(). row_purge_poss_sec(): Add the parameter mtr and remove redundant or unused parameters sec_pcur, sec_mtr, is_tree. We will use the caller's mtr object but release any acquired page latches before returning. btr_cur_get_page(), page_cur_get_page(): Do not invoke page_align(). row_purge_remove_sec_if_poss_leaf(): Return the value of PAGE_MAX_TRX_ID to be checked against the page in row_purge_remove_sec_if_poss_tree(). If the secondary index page was not changed meanwhile, it will be unnecessary to invoke row_purge_poss_sec() again. trx_undo_prev_version_build(): Access any undo log pages using the caller's mini-transaction object. row_purge_vc_matches_cluster(): Moved to the only compilation unit that needs it. Reviewed by: Debarun Banerjee	2024-08-26 12:23:06 +03:00
Marko Mäkelä	62bfcfd8b2	Merge 10.6 into 10.11	2024-08-14 11:36:52 +03:00
Marko Mäkelä	757c368139	Merge 10.5 into 10.6	2024-08-14 10:56:11 +03:00
Jan Lindström	cd8b8bb964	MDEV-34594 : Assertion `client_state.transaction().active()' failed in int wsrep_thd_append_key(THD, const wsrep_key, int, Wsrep_service_key_type) CREATE TABLE [SELECT\|REPLACE SELECT] is CTAS and idea was that we force ROW format. However, it was not correctly enforced and keys were appended before wsrep transaction was started. At THD::decide_logging_format we should force used stmt binlog format to ROW in CTAS case and produce a warning if used binlog format was not ROW. At ha_innodb::update_row we should not append keys similarly as in ha_innodb::write_row if sql_command is SQLCOM_CREATE_TABLE. Improved error logging on ::write_row, ::update_row and ::delete_row if wsrep key append fails. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-08-12 23:54:30 +02:00
Oleksandr Byelkin	0e8fb977b0	Merge branch '10.6' into 10.11	2024-08-03 09:15:40 +02:00
Oleksandr Byelkin	8f020508c8	Merge branch '10.5' into 10.6	2024-08-03 09:04:24 +02:00
Thirunarayanan Balathandayuthapani	533e6d5d13	MDEV-34670 IMPORT TABLESPACE unnecessary traverses tablespace list Problem: ======== - After the commit `ada1074bb1` (MDEV-14398) fil_crypt_set_encrypt_tables() iterates through all tablespaces to fill the default_encrypt tables list. This was a trigger to encrypt or decrypt when key rotation age is set to 0. But import tablespace does call fil_crypt_set_encrypt_tables() unnecessarily. The motivation for the call is to signal the encryption threads. Fix: ==== ha_innobase::discard_or_import_tablespace: Remove the fil_crypt_set_encrypt_tables() and add the import tablespace to the default encrypt list if necessary	2024-07-31 14:13:38 +05:30
Thirunarayanan Balathandayuthapani	cc8eefb0dc	MDEV-33087 ALTER TABLE...ALGORITHM=COPY should build indexes more efficiently - During copy algorithm, InnoDB should use bulk insert operation for row by row insert operation. By doing this, copy algorithm can effectively build indexes. This optimization is disabled for temporary table, versioning table and table which has foreign key relation. Introduced the variable innodb_alter_copy_bulk to allow the bulk insert operation for copy alter operation inside InnoDB. This is enabled by default ha_innobase::extra(): HA_EXTRA_END_ALTER_COPY mode tries to apply the buffered bulk insert operation, updates the non-persistent table stats. row_merge_bulk_t::write_to_index(): Update stat_n_rows after applying the bulk insert operation row_ins_clust_index_entry_low(): In case of copy algorithm, switch to bulk insert operation. copy_data_error_ignore(): Handles the error while copying the data from source to target file.	2024-07-30 11:59:01 +05:30
Monty	4bf7c966b3	MDEV-34664: Add an option to fix InnoDB's doubling of secondary index cardinalities (With trivial fixes by sergey@mariadb.com) Added option fix_innodb_cardinality to optimizer_adjust_secondary_key_costs Using fix_innodb_cardinality disables the 'divide by 2' of rec_per_key_int in InnoDB that in effect doubles the Cardinality for secondary keys. This has the biggest effect for indexes where a few rows have the same key value. Using this may also cause table scans for very small tables (which in some cases may be better than an index scan). The user visible effect is that 'SHOW INDEX FROM table_name' will for InnoDB show the true Cardinality (and not 2x the real value). It will also allow the optimizer to choose a better index in some cases as the division by 2 could have a bad effect for tables with 2-5 identical values per key. A few notes about using fix_innodb_cardinality: - It has a direct effect on SHOW INDEX FROM table_name. SHOW INDEX will also update the statistics in table share. - The effect of fix_innodb_cardinality for query plans or EXPLAIN is only visible after first open of the table. This is why one must do a flush tables or use SHOW INDEX for the option to take effect. - Using fix_innodb_cardinality can thus affect all users in their query plans if they are using the same tables. Because of this, it is strongly recommended that one uses optimizer_adjust_secondary_key_costs=fix_innodb_cardinality mainly in configuration files to not cause issues for other users.	2024-07-29 16:40:53 +03:00
Alexander Barkov	4d71a117a3	Merge remote-tracking branch 'origin/10.6' into 10.11	2024-07-08 21:52:08 +04:00
Alexander Barkov	e56040fee8	Merge remote-tracking branch 'origin/10.5' into 10.6	2024-07-08 18:59:04 +04:00
Thirunarayanan Balathandayuthapani	834c013b64	MDEV-34519 innodb_log_checkpoint_now crashes when innodb_read_only is enabled During read only mode, InnoDB doesn't allow checkpoint to happen. So InnoDB should throw the warning when InnoDB tries to force the checkpoint when innodb_read_only = 1 or innodb_force_recovery = 6.	2024-07-05 15:26:05 +05:30
Oleksandr Byelkin	034a175982	Merge branch '10.6' into 10.11	2024-07-04 11:52:07 +02:00
Denis Protivensky	cfbd57dfb7	MDEV-33064: Sync trx->wsrep state from THD on trx start InnoDB transactions may be reused after committed: - when taken from the transaction pool - during a DDL operation execution In this case wsrep flag on trx object is cleared, which may cause wrong execution logic afterwards (wsrep-related hooks are not run). Make trx->wsrep flag initialize from THD object only once on InnoDB transaction start and don't change it throughout the transaction's lifetime. The flag is reset at commit time as before. Unconditionally set wsrep=OFF for THD objects that represent InnoDB background threads. Make Wsrep_schema::store_view() operate in its own transaction. Fix streaming replication transactions' fragments rollback to not switch THD->wsrep value during transaction's execution (use THD->wsrep_ignore_table as a workaround). Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-07-01 13:07:39 +02:00
Marko Mäkelä	1d76794aba	Merge 10.6 into 10.11	2024-06-28 16:03:28 +03:00
Marko Mäkelä	d1ecf5cc5f	MDEV-32176 Contention in ha_innobase::info_low() During a Sysbench oltp_point_select workload with 1 table and 400 concurrent connections, a bottleneck on dict_table_t::lock_mutex was observed in ha_innobase::info_low(). dict_table_t::lock_latch: Replaces lock_mutex. In ha_innobase::info_low() and several other places, we will acquire a shared dict_table_t::lock_latch or we may elide the latch if hardware memory transactions are available. innobase_build_v_templ(): Remove the parameter "bool locked", and require the caller to hold exclusive dict_table_t::lock_latch (instead of holding an exclusive dict_sys.latch). Tested by: Vladislav Vaintroub Reviewed by: Vladislav Vaintroub	2024-06-28 15:57:07 +03:00
Marko Mäkelä	4ca355d863	MDEV-33894: Resurrect innodb_log_write_ahead_size As part of commit `685d958e38` (MDEV-14425) the parameter innodb_log_write_ahead_size was removed, because it was thought that determining the physical block size would be a sufficient replacement. However, we can only determine the physical block size on Linux or Microsoft Windows. On some file systems, the physical block size is not relevant. For example, XFS uses a block size of 4096 bytes even if the underlying block size may be smaller. On Linux, we failed to determine the physical block size if innodb_log_file_buffered=OFF was not requested or possible. This will be fixed. log_sys.write_size: The value of the reintroduced parameter innodb_log_write_ahead_size. To keep it simple, this is read-only and a power of two between 512 and 4096 bytes, so that the previous alignment guarantees are fulfilled. This will replace the previous log_sys.get_block_size(). log_sys.block_size, log_t::get_block_size(): Remove. log_t::set_block_size(): Ensure that write_size will not be less than the physical block size. There is no point to invoke this function with 512 or less, because that is the minimum value of write_size. innodb_params_adjust(): Add some disabled code for adjusting the minimum value and default value of innodb_log_write_ahead_size to reflect the log_sys.write_size. log_t::set_recovered(): Mark the recovery completed. This is the place to adjust some things if we want to allow write_size>4096. log_t::resize_write_buf(): Refer to write_size. log_t::resize_start(): Refer to write_size instead of get_block_size(). log_write_buf(): Simplify some arithmetics and remove a goto. log_t::write_buf(): Refer to write_size. If we are writing less than that, do not switch buffers, but keep writing to the same buffer. Move some code to improve the locality of reference. recv_scan_log(): Refer to write_size instead of get_block_size(). 
os_file_create_func(): For type==OS_LOG_FILE on Linux, always invoke os_file_log_maybe_unbuffered(), so that log_sys.set_block_size() will be invoked even if we are not attempting to use O_DIRECT. recv_sys_t::find_checkpoint(): Read the entire log header in a single 12 KiB request into log_sys.buf. Tested with: ./mtr --loose-innodb-log-write-ahead-size=4096 ./mtr --loose-innodb-log-write-ahead-size=2048	2024-06-27 16:38:08 +03:00
Marko Mäkelä	27a3366663	Merge 10.6 into 10.11	2024-06-27 10:26:09 +03:00
Marko Mäkelä	0076eb3d4e	Merge 10.5 into 10.6	2024-06-24 13:09:47 +03:00
Marko Mäkelä	acc077ffa1	MDEV-34443 ha_innobase::info_low() does not distinguish HA_STATUS_VARIABLE_EXTRA ha_innobase::info_low(): For HA_STATUS_VARIABLE without HA_STATUS_VARIABLE_EXTRA, let us avoid unnecessary and costly updates of the data_free statistics, which are only needed for SHOW TABLE STATUS. This optimization had been enabled in commit `247ecb7597` but not utilized until now.	2024-06-24 10:39:13 +03:00
Jan Lindström	ee974ca5e0	MDEV-31658 : Deadlock found when trying to get lock during applying Problem was that there were two non-conflicting local idle transactions in node_1 that both inserted a key to primary key. Then two transactions from other nodes inserted also a key to primary key so that insert from node_2 conflicted one of the local transactions in node_1 so that there would be duplicate key if both are committed. For this insert from other node tries to acquire S-lock for this record and because this insert is high priority brute force (BF) transaction it will kill idle local transaction. Concurrently, second insert from node_3 conflicts the second idle insert transaction in node_1. Again, it tries to acquire S-lock for this record and kills idle local transaction. At this point we have two non-conflicting high priority transactions holding S-lock on different records in node_1. For example like this: rec s-lock-node2-rec s-lock-node3-rec rec. Because these high priority BF-transactions do not wait each other insert from node3 that has later seqno compared to insert from node2 can continue. It will try to acquire insert intention for record it tries to insert (to avoid duplicate key to be inserted by local transaction). However, it will note that there is conflicting S-lock in same gap between records. This will lead to a deadlock error as we have defined that BF-transactions may not wait for record lock but we can't kill conflicting BF-transaction because it has lower seqno and it should commit first. BF-transactions are executed concurrently because their values to primary key are different i.e. they do not conflict. Galera certification will make sure that inserts from other nodes i.e these high priority BF-transactions can't insert duplicate keys. Local transactions naturally can but they will be killed when BF-transaction acquires required record locks.
Therefore, we can allow situation where there is conflicting S-lock and insert intention lock regardless of their seqno order and let both continue with no wait. This will lead to situation where we need to allow BF-transaction to wait when lock_rec_has_to_wait_in_queue is called because this function is also called from lock_rec_queue_validate and because lock is waiting there would be assertion in ut_a(lock->is_gap() \|\| lock_rec_has_to_wait_in_queue(cell, lock)); lock_wait_wsrep_kill Add debug sync points for BF-transactions killing local transaction. wsrep_assert_no_bf_bf_wait Print also requested lock information lock_rec_has_to_wait Add function to handle wsrep transaction lock wait cases. lock_rec_has_to_wait_wsrep New function to handle wsrep transaction lock wait exceptions. lock_rec_has_to_wait_in_queue Remove wsrep exception, in this function all conflicting locks need to wait in queue. Conflicts between BF and local transactions are handled in lock_wait. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-06-19 14:09:11 +02:00
Marko Mäkelä	b81d717387	Merge 10.6 into 10.11	2024-06-11 12:50:10 +03:00
Marko Mäkelä	27834ebc91	Merge 10.5 into 10.6	2024-06-10 15:22:15 +03:00
Thirunarayanan Balathandayuthapani	b7a75fbb8a	MDEV-34169 Don't allow innodb_open_files to be lesser than number of non-user tablespace. - InnoDB only closes the user tablespace when the number of open files exceeds innodb_open_files limit. In that case, InnoDB should make sure that innodb_open_files value should be greater than number of undo tablespace, system and temporary tablespace files.	2024-06-07 15:37:11 +05:30
Marko Mäkelä	699d38d951	MDEV-34296 extern thread_local is a CPU waste In commit 99bd22605938c42d876194f2ec75b32e658f00f5 (MDEV-31558) we wrongly thought that there would be minimal overhead for accessing a thread-local variable mariadb_stats. It turns out that in C++11, each access to an extern thread_local variable requires conditionally invoking an initialization function. In fact, the initializer expression of mariadb_stats is dynamic, and those calls were actually unavoidable. In C++20, one could declare constinit thread_local variables, but the address of a thread_local variable (&mariadb_dummy_stats) is not a compile-time constant. We did not want to declare mariadb_dummy_stats without thread_local, because then the dummy accesses could lead to cache line contention between threads. mariadb_stats: Declare as __thread or __declspec(thread) so that there will be no dynamic initialization, but zero-initialization. mariadb_dummy_stats: Remove. It is a lesser evil to let the environment perform zero-initialization and check if !mariadb_stats. Reviewed by: Sergei Petrunia	2024-06-06 14:38:42 +03:00
Thirunarayanan Balathandayuthapani	58a0e1e3dd	MDEV-34223 Innodb - add status variable for number of bulk inserts - Added a counter innodb_num_bulk_insert_operation in INFORMATION_SCHEMA.GLOBAL_STATUS. This counter is incremented whenever InnoDB performs a bulk insert operation. - Change the innodb_instant_alter_column to atomic variable.	2024-06-03 16:27:22 +05:30
Vladislav Vaintroub	736449d30f	MDEV-34205: ASAN stack buffer overflow in strxnmov() in frm_file_exists Correct the second parameter for strxnmov to prevent potential buffer overflows. The second parameter must be one less than the size of the input buffer to avoid writing past the end of the buffer. While the second parameter is usually correct, there are exceptions that need fixing. This commit addresses the issue within frm_file_exists() and other affected places.	2024-05-23 22:08:27 +02:00
Sergei Golubchik	0aae11ac28	Merge branch '10.6' into 10.11	2024-04-30 16:56:49 +02:00
Thirunarayanan Balathandayuthapani	8c8b7da017	MDEV-33979 Disallow bulk insert operation during partition update statement Problem: ======== - Partition update operation enables the bulk insert for the transaction while moving the row between partitions. This leads to debug assert failure while removing the row from one of the partition. Solution: ======== - Disallow the bulk insert operation for non-insert operation of partition table.	2024-04-25 10:50:34 +05:30
Sergei Golubchik	018d537ec1	Merge branch '10.6' into 10.11	2024-04-22 15:23:10 +02:00
Marko Mäkelä	e459ce8336	MDEV-33779 InnoDB row operations could be faster We have quite a few assertions ut_a(m_prebuilt->trx == thd_to_trx(ha_thd())); in low-level functions. These had better be debug assertions for performance reasons. It should suffice to check that condition in the less frequently invoked ha_innobase::change_active_index(). convert_search_mode_to_innobase(): Return whether the mode is unsupported, and optionally update ha_innobase::m_last_match_mode. ha_innobase::index_read(): Only branch on find_flag once, and simplify the error handling after invoking row_search_mvcc(). ha_innobase::rnd_pos(): Remove an assertion that is duplicating one in ha_innobase::index_read(), which we are calling unconditionally. ha_innobase::records_in_range(): Check only once whether min_key, max_key are null pointers. row_sel_convert_mysql_key_to_innobase(): Declare all parameters except the conversion buffer pointer (buf) to be nonnull. Reviewed by: Debarun Banerjee	2024-04-17 16:47:41 +03:00
Marko Mäkelä	829cb1a49c	Merge 10.5 into 10.6	2024-04-17 14:14:58 +03:00
Kristian Nielsen	16aa4b5f59	Merge from 10.4 to 10.5 Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-04-15 17:46:49 +02:00
Sergei Golubchik	41296a07c8	Merge branch '10.5' into 10.6	2024-04-11 13:58:22 +02:00
Marko Mäkelä	0892e6d028	MDEV-33585 The maximum innodb_log_buffer_size is too large On Microsoft Windows, ReadFile() as well as WriteFile() limit the size of the request to DWORD, which is 32 bits (at most 4 GiB - 1) also on 64-bit systems. On FreeBSD, sysctl debug.iosize_max_clamp could limit the size of a write request to INT_MAX. The size of a read request is always limited to INT_MAX. This would allow the request size to be 4095 bytes more than the Linux limit (0x7ffff000 according to "man 2 read" and "man 2 write"). On OpenBSD, Solaris and possibly NetBSD, the read request size is limited to SSIZE_T_MAX, which would be half the current maximum innodb_log_buffer_size. This should be not much of an issue anyway, because on contemporary 64-bit platforms, the virtual addresses are limited to 48 bits. IBM AIX documentation mentions OFF_MAX which would apply when a 64-bit application is running on a 32-bit kernel. Let us declare innodb_log_buffer_size as 32-bit unsigned and make the maximum 0x7ffff000, to be compatible with the least common denominator (Linux). The maximum innodb_sort_buffer_size already was 64 MiB, which is not a problem. SyncFileIO::execute(): Assert that the size of a synchronous read or write request is limited to the maximum. Reviewed by: Vladislav Vaintroub	2024-04-09 09:32:47 +03:00
Thirunarayanan Balathandayuthapani	188c5da72a	MDEV-32453 Bulk insert fails to apply when trigger does insert operation Reason: ======= - InnoDB fails to apply the buffered insert operation if the after insert trigger does change the same table. This behaviour leads to empty table for the subsequent insert operation and server abort. Solution: ======== - InnoDB should apply buffered insert operation if "after insert" trigger changes the same table.	2024-04-08 14:24:20 +05:30
sjaakola	2fcf2ec229	MDEV-33749 hyphen in table name can cause galera certification failures Fix in this commit handles foreign key value appending into write set so that db and table names are converted from the filepath format to tablename format. This is compatible with key values appended from elsewhere in the code base There is a mtr test galera.galera_table_with_hyphen for regression testing Reviewer: monty@mariadb.com	2024-04-04 17:12:09 +03:00
Marko Mäkelä	788953463d	Merge 10.6 into 10.11 Some fixes related to commit `f838b2d799` and Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row() for system-versioned tables were provided by Nikita Malyavin. This was required by test versioning.rpl,trx_id,row.	2024-03-28 09:16:57 +02:00
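
The thread-local storage change described in commit 699d38d951 (MDEV-34296) can be illustrated with a minimal C++ sketch. It is not the server code: the names `stats_t`, `current_stats`, `set_stats()` and `count_page_read()` are hypothetical, and a constant-initialized `thread_local` stands in for the `__thread` / `__declspec(thread)` declaration that the commit mentions, only to keep the sketch portable. The idea shown is the one the message states: let the pointer be zero-initialized and check it for null on each use, instead of pointing it at a dummy object through a dynamic initializer.

```cpp
#include <cstddef>

// Hypothetical stand-in for a per-thread statistics structure.
struct stats_t { std::size_t pages_read = 0; };

// Zero-initialized thread-local pointer. Because the initializer is a
// constant, no dynamic initialization has to run before the first access.
// (Assumption: the real code uses __thread or __declspec(thread).)
static thread_local stats_t *current_stats = nullptr;

// Register statistics for the calling thread, or unregister with nullptr.
inline void set_stats(stats_t *s) { current_stats = s; }

// Count a page read only if the calling thread has registered statistics.
// The null check replaces the former dummy object.
inline void count_page_read()
{
  if (stats_t *s = current_stats)
    s->pages_read++;
}
```

A thread that never calls `set_stats()` pays only for the null check in `count_page_read()`, and no shared dummy object exists that could cause cache line contention between threads.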

3,327 commits
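
The nullability rules listed in commit cc810e64d4 (MDEV-34392) can be restated as two small predicates. This is a sketch of the rules as worded in the commit message only; `fk_rule`, `child_may_become_not_null()` and `parent_may_become_nullable()` are hypothetical names and do not correspond to functions in the server source.

```cpp
// Hypothetical summary of the actions of one foreign key constraint.
struct fk_rule
{
  bool update_set_null;  // ON UPDATE SET NULL
  bool delete_set_null;  // ON DELETE SET NULL
  bool update_cascade;   // ON UPDATE CASCADE
};

// May a referencing (child) key column be changed from NULL to NOT NULL?
// Refused for SET NULL actions, and for UPDATE CASCADE when the referenced
// column is declared as NULL.
constexpr bool child_may_become_not_null(fk_rule fk, bool referenced_nullable)
{
  return !(fk.update_set_null || fk.delete_set_null ||
           (fk.update_cascade && referenced_nullable));
}

// May a referenced (parent) key column be changed from NOT NULL to NULL?
// Refused for UPDATE CASCADE when the referencing columns do not allow NULL.
constexpr bool parent_may_become_nullable(fk_rule fk, bool referencing_nullable)
{
  return !(fk.update_cascade && !referencing_nullable);
}
```

According to the commit message, the same rules are enforced for both the COPY and the INPLACE algorithm, which is why they are written here as pure functions of the constraint and the column nullability.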